Abstract

Recent literature has proposed employing a single experimental design capable of
performing both factor screening and response surface estimation when conducting
sequential experiments is unrealistic due to time, budget, or other constraints.
Military systems, particularly aerodynamic systems, are complex. It is not unusual for
these systems to exhibit nonlinear response behavior. Developmental testing may be
tasked to characterize the nonlinear behavior of such systems while being restricted
in how much testing can be accomplished. Second-order screening designs provide a
means in a single design experiment to effectively focus test resources onto those
factors driving system performance. Sponsored by the Office of the Secretary of Defense
(OSD) in support of the Science of Test initiative, this research characterizes and
adds to the area of second-order screening designs, particularly as applied to defense
testing. Existing design methods are empirically tested and examined for robustness.
The leading design method, a method that is very run efficient, is extended to
overcome limitations when screening for nonlinear effects. A case study and screening
design guidance for defense testers are also provided.

A COMPARISON STUDY OF SECOND-ORDER SCREENING DESIGNS AND THEIR EXTENSION

I.  Introduction 


1.1  Background 

Shrinking budgets, in conjunction with the rising costs associated with replacing
aging military hardware, have highlighted the necessity for Department of Defense
(DOD) organizations to demonstrate fiscal responsibility while still maintaining core
capabilities. As a result, the DOD continues to look for methods which promote
efficiencies in all its operations. To that end, in April 2012 the Scientific Test and
Analysis Techniques in Test & Evaluation Center of Excellence (STAT T&E COE) was
established at the Air Force Institute of Technology Graduate School of Engineering and
Management by the Deputy Assistant Secretary of Defense for Developmental Test
and Evaluation and the Director, Air Force Test and Evaluation. Dr. Steven Hutchison,
Principal Deputy, Office of the Deputy Assistant Secretary of Defense for
Developmental Test and Evaluation (DASD(DT&E)), stated, “By applying scientific methods
to the test design, we can not only achieve great efficiencies, but we can significantly
improve confidence in our results. The STAT T&E COE will provide a critical venue
for enhancing the test design for DOD acquisition programs.”

Prior to the establishment of the STAT T&E COE, Dr. J. Michael Gilmore,
Director of Operational Test and Evaluation (DOT&E), started an “initiative to
increase the use of scientific and statistical methods in developing rigorous, defensible
test plans and in evaluating their results” within OT&E (Gilmore, 2010). In a 2010
memorandum, Dr. Gilmore provided key policy guidance on the use of Design of
Experiments (DOE) in OT&E. Furthermore, the DOT&E Scientific Advisor (SA), Dr.
Catherine Warner, highlighted the fact that while DOE is a structured, rigorous
statistical tool for test planning and analysis, and it has been written about extensively
within the academic setting, there are still many questions regarding how to apply
DOE to T&E within DOD (Warner, 2011).

An ongoing effort, beginning in 2009, which focuses on transitioning basic science
of test techniques and test methodology to DOD practice is the “Science of Test”
initiative. Funded by OSD DOT&E in 2011, the member institutes which comprise this
research consortium are Arizona State University, Virginia Tech, Naval Postgraduate
School, and the AFIT Center of Operational Analysis.

This  dissertation  directly  supports  the  “Science  of  Test”  initiative.  In  particular, 
this  research  addresses  the  use  of  design  of  experiments  and  response  surface  designs 
to  characterize  the  area  of  second-order  screening  designs,  particularly  as  applied  to 
defense testing. Extensions to existing designs are examined with respect to
improvements in robustness and applicability to defense testing.

1.2  Problem  Context 

Response surface methodology (RSM) is a collection of statistical design and
numerical optimization techniques used to model a surface as an approximation for
the relationship between a process or system response and its input factors (Myers
and Anderson-Cook, 2009). The shape of the estimated surface is determined by the
model selected to approximate the system and the response values recorded from
various input factor settings. The assumption is that a response $\eta$ is an unknown
function of a set of design variables $x_1, x_2, \ldots, x_k$ and that the function can be
approximated by a polynomial model. Prominent among the models considered are the
first-order model

$$\eta = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k \tag{1.1}$$

and the second-order model

$$\eta = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} x_i x_j. \tag{1.2}$$

Box and Wilson (1951) laid the foundation for RSM by outlining a philosophy
of sequential experimentation which included experiments for screening, region
seeking (such as steepest ascent), process/product characterization, and process/product
optimization (Myers et al., 2004). Box and Liu (1999) illustrated a number of
concepts which Box understood as the embodiment of RSM at the time, including the
philosophy of sequential learning.

As  such,  the  standard  RSM  approach  is  to  use  a  three-stage  process;  however, 
there  are  times  when  the  sequential  nature  can  be  a  disadvantage,  especially  when 
the  duration  of  an  experiment  is  long  or  experimental  preparation  is  time-consuming. 
In these instances, it would be better, if not necessary, to perform factor screening
and response surface exploration in the same experiment rather than conducting
experiments sequentially.

1.3  Problem  Statement 

Military  systems,  particularly  aerodynamic  systems,  are  complex.  It  is  not  unusual 
for  these  systems  to  exhibit  nonlinear  behavior.  Developmental  testing  may  be  tasked 
to  characterize  the  nonlinear  behavior  of  such  systems  but  may  also  be  restricted  in 
how  much  testing  can  be  accomplished.  Second-order  screening  designs  for  nonlinear 
system  responses  provide  a  means  to  effectively  focus  test  resources  onto  those  factors 
driving system performance.

Second-order screening design methodology, sometimes referred to as One-Step
Response  Surface  Methodology  or  Definitive  Screening,  is  a  relatively  new  focus  in 
statistical  research  and  effectively  unknown  to  the  defense  test  community.  Important 
questions  as  to  the  method’s  usefulness  and  applicability  remain  unaddressed  and  so 
are  examined  in  this  research. 

1.4  Research  Objective  and  Scope 

This research will characterize and add to the area of second-order screening
designs, particularly as applied to defense testing. Existing design methods are tested
and examined for robustness. Extensions to existing designs are examined with
respect to improvements in robustness and applicability to defense testing.

We  proceed  with  the  following  goals: 

1. Conduct an empirical study to characterize and better understand newly proposed
second-order screening designs.

2. Identify second-order screening designs with favorable design parameter
properties, either through augmentation of existing designs or through creation of new
designs.

3. Develop guidelines for the use of second-order screening designs in DOD
tests.

By  accomplishing  these  research  goals,  we  can  help  make  test  managers  within 
the  DOD  comfortable  with  implementing  DOE  techniques  capable  of  examining  the 
complex  nature  of  military  systems  within  fiscal,  time,  and  resource  constraints. 




1.5  Overview 


The  remainder  of  this  dissertation  follows  a  scholarly  article  format.  Chapter 
II  contains  a  detailed  literature  review  of  screening  and  response  surface  designs, 
partitioned by sequential and single-phase methods for fitting first-order and
second-order response surfaces. Chapters III, IV, and V are self-contained research articles
on  second-order  screening  designs.  Each  contains  a  literature  review  of  the  research 
relevant  to  that  chapter.  The  original  contribution  of  each  chapter  is  as  follows: 

Chapter III formally examines the robustness of the two arguably best
second-order screening designs with respect to the assumptions of both sparsity (factor or
effect) and heredity (strong or weak). To date, evaluation of screening design
performance has assumed both factor sparsity and strong effect heredity. The article is
currently under review for publication in Quality and Reliability Engineering
International.

Chapter IV describes a computer-generated D-optimality design augmentation
technique which uses a $k$-factor Definitive Screening Design (DSD) as a baseline
fixed design and augments the design with $k - 1$ additional runs. In a simulation
study, the proposed augmented Definitive Screening Designs (DSD+) were able to
increase the robustness of the original DSD to the principles of heredity and sparsity
while also increasing the detection rate of second-order effects when both two-factor
interactions and pure-quadratic effects are active. The article is currently under
review for publication in the Journal of Quality Technology.

Chapter V presents the use of design of experiments and response surface
designs in the area of second-order screening designs, particularly as applied to defense
testing, by demonstrating the viable use of second-order screening designs in a
wind tunnel case study. The article is currently targeted for publication in Military
Operations Research or the Journal of Defense Modeling and Simulation.

Lastly, Chapter VI reiterates the importance of studying second-order screening
designs, summarizes all original research contributions, and provides suggestions
for future work.




II.  Literature  Review 


This chapter covers the literature pertinent to this research effort. After a brief
synopsis of Design of Experiments (DOE) and Response Surface Methodology (RSM),
including common designs and terminology, the focus of the chapter shifts to
relevant literature on second-order screening designs. Research on second-order screening
designs falls into two broad categories: construction and design assessment/analysis
methods. Both areas are extensively reviewed, with gaps and limitations
discussed.

2.1  Design  of  Experiments  (DOE) 

DOE, or experimental design, is a statistical technique used to organize an
experimental test or series of tests so that observed changes in an output response can be
attributed to systematic changes made to the input variables of a process or system
(Montgomery, 2013). While the designs are based upon statistical techniques, the
actual design forms vary greatly depending upon the experimental objective. For
instance, the objectives of screening, modeling, or optimizing a process or system can
result in vastly different designs.

Design selection also depends upon the form of the empirical model used to
represent the process or system response. Typically, first-order polynomial models are
used extensively in screening experiments while second-order polynomial models are
commonly used in modeling and optimization experiments. Inherent within the
designs and execution are data collection plans enabling the application of subsequent
statistical analysis methods to reach valid and objective conclusions.

For the case of two independent factors, the first-order polynomial or main
effects model is

$$y = \beta_0 + \beta_1 A + \beta_2 B + \varepsilon \tag{2.1}$$

where $y$ is the response, $A$ and $B$ are the design factors, the $\beta$s are unknown estimable
parameters, and $\varepsilon$ is a random error term accounting for the experimental error in
the system. An interaction term is usually added to the first-order model yielding

$$y = \beta_0 + \beta_1 A + \beta_2 B + \beta_{12} AB + \varepsilon \tag{2.2}$$

where $\beta_{12}$ represents the two-factor interaction effect between the design factors
$A$ and $B$. The second-order polynomial model with two factors is

$$y = \beta_0 + \beta_1 A + \beta_2 B + \beta_{12} AB + \beta_{11} A^2 + \beta_{22} B^2 + \varepsilon. \tag{2.3}$$

Second-order models are often used for response surface exploration (Montgomery,
2013). More general forms are given in Equations 1.1 and 1.2.
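
As a minimal sketch of how the two-factor models above translate into model-matrix columns, the helper below expands coded factor settings into the terms of a second-order polynomial. The function name `expand_second_order` and the column ordering are assumptions made for illustration; they are not taken from the cited texts.

```python
from itertools import combinations

def expand_second_order(x):
    """Expand coded factor settings x = (x1, ..., xk) into the terms of a
    second-order model: intercept, main effects, two-factor interactions,
    and pure quadratic terms (the ordering is an illustrative choice)."""
    k = len(x)
    row = [1]                                                    # intercept
    row += list(x)                                               # main effects
    row += [x[i] * x[j] for i, j in combinations(range(k), 2)]   # interactions
    row += [xi * xi for xi in x]                                 # pure quadratics
    return row

# Two factors at coded settings A = -1, B = +1.
# Columns are: 1, A, B, AB, A^2, B^2.
row = expand_second_order((-1, 1))
print(row)
```

Stacking one such row per experimental run gives the model matrix to which the unknown coefficients are fitted.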

2.1.1  Screening  Designs. 

Many  experiments  may  start  by  considering  many  factors,  which  in  turn  increases 
the  overall  size  and  cost  of  the  experiment.  Screening  designs  are  a  category  of 
experimental  designs,  usually  performed  during  the  early  stages  of  a  process  or  system 
study,  used  to  determine  which  of  the  many  factors  (if  any)  have  a  significant  effect 
on the system or process. Screening designs usually assume a linear (main effects or
main effects plus interaction) response function so that factors can be studied at two
levels, thereby conserving experimental resources.

Popular experimental designs used in screening experiments are full and fractional
2-level factorial designs, Plackett-Burman designs, and supersaturated designs. While all of
these designs are capable of identifying the main effects, only the full factorial design
is capable of identifying all interactions. To a varying degree, the remaining designs
are capable of identifying some or all two-factor interaction effects.

2.1.2  Response  Surface  Designs  (RSD). 

Response  surface  designs  are  experimental  designs  used  when  the  response  surface 
is  believed  to  possess  significant  curvature.  In  order  to  estimate  curvature,  each  factor 
needs  at  least  three  levels.  Response  surface  designs  fulfill  this  requirement  through 
augmentation  of  two-level  regular  designs  or  by  specifying  designs  robust  to  the  linear 
effect assumption. Response surface designs are called second-order designs because
all $(k + 1)(k + 2)/2$ parameters in Equation 1.2 are estimable in the design.

A $3^k$ or $3^{k-p}$ fractional factorial design is often suggested to deal with response
curvature. However, more efficient options are available including the Central Composite
Design (CCD), Box-Behnken Design (BBD), and saturated/near-saturated Hoke,
Hybrid, and Small Composite Designs (SCD).

2.1.2.1 Response Surface Methodology (RSM).

Since Box and Wilson (1951) laid the foundations for RSM, four comprehensive
historical reviews have been written. Hill and Hunter (1966) provided a
comprehensive bibliography while focusing on the practical applications of RSM in the fields of
chemistry and chemical engineering. The Mead and Pike (1975) review focused on
RSM as it applied to the modeling of biological data in the field of biometrics. Myers
et al. (1989) reviewed important developments in RSM during the 1970s and 1980s
while clearly defining RSM as “being confined to that of a collection of tools in design
or data analysis that enhance the exploration of a region of design variables in one or
more responses” (Myers et al., 1989).

Myers et al. (2004) provide the most current comprehensive review of RSM through
discussions on advancements in robust parameter design and new developments in
response surface design to include methods for evaluating response surface designs.
Additionally, Myers et al. (2004) address both design and optimization issues for
multiple responses and the application of generalized linear models.

Unfortunately, the nature of DOE and RSM sometimes makes it difficult to
differentiate or draw a clear distinction between the two. Whereas DOE is comprised
of RSDs used for response surface models which include quadratic terms for
curvature, RSM employs DOE screening designs. As the RSM name implies, RSM is best
viewed in context as a methodology which employs DOE elements with the goal of
determining how changes in design variables can provide process improvement or
optimization. As such, the standard RSM can be described as consisting of two stages:
factor screening and response surface exploration.

Traditionally, the research on factor screening and on response surface exploration
has proceeded not in concert but along separate avenues. The former involves concepts
like design resolution, minimum aberration, power, and the number of clear
(non-confounded) effects, and the latter involves concepts like rotatability,
alphabetical-optimality, and prediction variance.

2.1.3  Design  Resolution. 

Resolution is a measure of the degree of confounding for main effects and
interactions in a fractional factorial design. Resolution is generally denoted in Roman
numerals. The smallest useful resolution is III, and a design can technically have
resolutions as high as $k + 1$. Designs of resolution III, IV, and V are most prevalently
used because of the nature of confounding found within the designs. The confounding
characteristics of these design resolutions are:

• Res III: Main effects are clear of other main effects, but at least one main effect is
confounded with at least one two-way interaction.

• Res IV: Main effects are clear of two-way interactions, but at least one two-way
interaction is confounded with at least one other two-way interaction.

• Res V: Main effects and two-way interactions are clear of any other main effect
or two-way interaction, but at least one two-way interaction is confounded with
at least one three-way interaction.

As an example, a design which confounds a variable A with a two-way interaction BC
would at best be a resolution III design where A and BC are correlated or aliased,
and therefore their effects cannot be independently quantified. Usually the design
that has the highest resolution possible, while meeting the required fractionation for
design run size consideration, is employed.
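
The aliasing of A with BC can be illustrated with a short sketch. It assumes a half fraction of a three-factor, two-level design built from the generator C = AB (defining relation I = ABC); this particular fraction is chosen here for illustration and is not a design taken from the text.

```python
from itertools import product

# Half fraction of a 2^3 design built from the generator C = AB
# (defining relation I = ABC), an illustrative resolution III design.
runs = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

col_A = [a for a, b, c in runs]
col_BC = [b * c for a, b, c in runs]

# The A column and the BC column are identical, so the two effects
# cannot be separated: A is aliased with BC.
print(runs)
print(col_A == col_BC)
```

Because the two columns coincide in every run, any contrast that estimates A estimates the sum of the A and BC effects.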

There are times, however, when different designs possess the same resolution and
fractionation but have different confounding or aliasing structures. Fries and Hunter
(1980) proposed the concept of design aberration for regular two-level designs as a
means to differentiate between these designs. They defined a minimum aberration
design as the design of maximum resolution R “which minimizes the number of words
in the defining relation that are of minimum length”. Since Fries and Hunter’s initial
work, the minimum aberration criterion has been extended to two-level
non-regular, multilevel, and mixed-level fractional-factorial designs (Guo et al., 2009).

2.1.4  Optimality  Criteria. 

Optimal designs are typically assessed based upon specific criteria like providing
good estimation of model parameters or good prediction capacity within the design
region. Alphabetic-optimality refers to the family of design optimality criteria that are
characterized by a letter of the alphabet, currently A-, D-, G-, V-, or I-. These
alphabetical-optimality criteria drive what constitutes an optimal design. These
optimal designs are rather focused on a particular design characteristic. Two of the most
popular methods of characterizing optimality are I- and D-optimality.

D-optimality is based upon the notion of selecting design runs which maximize
the determinant of $\mathbf{X}'\mathbf{X}$, denoted as $|\mathbf{X}'\mathbf{X}|$, where $\mathbf{X}$ is the model matrix consisting
of the levels of the design matrix $\mathbf{D}$ expanded to model form. By selecting design
runs which maximize $|\mathbf{X}'\mathbf{X}|$ or minimize $|(\mathbf{X}'\mathbf{X})^{-1}|$, D-optimal designs minimize the
volume of the joint confidence region on the vector of the model regression coefficients
$\boldsymbol{\beta}$. Hence D-optimal designs focus on producing designs which provide good model
parameter estimates.

A-optimality focuses on producing good model parameter estimates by minimizing
the trace of $(\mathbf{X}'\mathbf{X})^{-1}$, denoted $\mathrm{tr}(\mathbf{X}'\mathbf{X})^{-1}$. In contrast to D-optimal designs, which
consider the covariances among coefficients through examining $|\mathbf{X}'\mathbf{X}|$, A-optimal
designs deal with only the diagonals of $(\mathbf{X}'\mathbf{X})^{-1}$, which are related to the individual
variances of the regression coefficients (Montgomery, 2013).
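
The two parameter-oriented criteria can be compared numerically. The sketch below is an illustration under stated assumptions rather than a design-search algorithm: it fits a first-order model with intercept, uses exact fractions to avoid rounding, and compares a $2^2$ factorial against a hypothetical four-run alternative (`shrunken`) invented here for contrast. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def information_matrix(X):
    """Return X'X for a model matrix X given as a list of rows."""
    p = len(X[0])
    return [[sum(Fraction(r[i]) * r[j] for r in X) for j in range(p)]
            for i in range(p)]

def det_and_inverse(M):
    """Gauss-Jordan elimination with exact fractions.
    Returns (determinant, inverse); raises ValueError if M is singular."""
    n = len(M)
    A = [list(row) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            raise ValueError("X'X is singular; the model is not estimable")
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det                      # a row swap flips the sign
        pv = A[c][c]
        det *= pv
        A[c] = [v / pv for v in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return det, [row[n:] for row in A]

def d_and_a_criteria(design):
    """First-order model with intercept: D = |X'X|, A = tr[(X'X)^-1]."""
    X = [[1] + list(run) for run in design]
    det, inv = det_and_inverse(information_matrix(X))
    return det, sum(inv[i][i] for i in range(len(inv)))

factorial = list(product((-1, 1), repeat=2))        # 2^2 factorial
shrunken = [(-1, -1), (1, -1), (-1, 1), (0, 0)]      # hypothetical alternative

print(d_and_a_criteria(factorial))
print(d_and_a_criteria(shrunken))
```

A larger determinant and a smaller trace both favor the factorial in this comparison, which agrees with the factorial being the better design under either criterion for this model.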

While D- and A-optimal designs focus on good model parameter estimates,
G-, V-, and I-optimal designs focus on good prediction capacity within the design
region by focusing on the scaled prediction variance $N\,\mathrm{Var}[\hat{y}(\mathbf{x})]/\sigma^2 = v(\mathbf{x})$, where
$N$ is the number of design runs. The G-optimality criterion is based upon the maximum
$v(\mathbf{x})$ over the entire design region, the I-optimality criterion is based upon the
average $v(\mathbf{x})$ over a region of interest, and the V-optimality criterion is based upon the
$v(\mathbf{x})$ for a specified set of points in the design region. A design is considered
G-optimal if the maximum value of $v(\mathbf{x})$ over the design region is a minimum, while a
design is considered V-optimal if it minimizes the average $v(\mathbf{x})$ over a set of points of
interest in the design region. Finally, a design is considered I-optimal if it minimizes
the average $v(\mathbf{x})$ over the design region for the regression model.
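
To make the prediction-oriented criteria concrete, the sketch below evaluates the scaled prediction variance of a $2^2$ factorial under a first-order model on a grid of points. The design, the model, and the grid resolution are assumptions chosen for illustration. Averaging over a finite grid is, strictly speaking, a V-type calculation that only approximates the region average used by the I criterion.

```python
from itertools import product

# 2^2 factorial with a first-order model f(x) = (1, x1, x2).
design = list(product((-1, 1), repeat=2))
N = len(design)

def f(x):
    return (1,) + tuple(x)

# For this orthogonal design X'X = N * I, so (X'X)^-1 = I / N and the
# scaled prediction variance N * f(x)'(X'X)^-1 f(x) reduces to f(x)'f(x).
X = [f(run) for run in design]
p = len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
assert xtx == [[N if i == j else 0 for j in range(p)] for i in range(p)]

def spv(x):
    return sum(t * t for t in f(x))

# Evaluate on a grid over the cuboidal region [-1, 1]^2.
steps = 21
grid = [-1 + 2 * i / (steps - 1) for i in range(steps)]
values = [spv((x1, x2)) for x1 in grid for x2 in grid]

g_value = max(values)                 # G criterion: worst-case SPV
i_value = sum(values) / len(values)   # grid approximation to the average SPV
print(g_value, round(i_value, 3))
```

For this design the worst-case value occurs at the corners of the region and equals the number of model parameters (three), which is the lower bound for the maximum scaled prediction variance.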

Since G-, V-, and I- criteria are prediction-oriented and A- and D- criteria
are parameter-oriented criteria, the G-, V-, and I- criteria are mostly used for
second-order designs while the A- and D- criteria are mostly used for first-order
designs. While the G- and D- criteria are widely seen throughout literature, the
G-criterion can become computationally difficult as the design matrix grows.
Fortunately, the I-criterion is computationally easier to implement than the G-criterion,
and is available in several software programs (Montgomery, 2013). For more on
alphabetic-optimality, please see Chapter 8 in Myers and Anderson-Cook (2009).


2.2  Full  Model  Estimable  Designs 


Designs which are full model estimable are designs which can estimate all
terms within the form of the empirical model used to represent the process or system
response. For a second-order polynomial model, the design must contain enough
degrees of freedom to estimate $p$ effects where

$$p = 1 + 2k + \frac{k(k-1)}{2} = \frac{(k+1)(k+2)}{2}.$$

Recall a second-order polynomial model contains 1 intercept, $k$ main effects, $k$ pure
quadratic, and $k(k-1)/2$ two-factor interaction terms for $k$ factors (Myers and
Anderson-Cook, 2009).
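
The parameter count grows quickly with the number of factors, which the short sketch below tabulates. The function name `second_order_terms` is illustrative; the calculation simply adds the four groups of terms listed above and checks the sum against the closed form.

```python
from math import comb

def second_order_terms(k):
    """Number of parameters in a full second-order model with k factors."""
    intercept, main, quadratic, interactions = 1, k, k, comb(k, 2)
    p = intercept + main + quadratic + interactions
    assert p == (k + 1) * (k + 2) // 2   # closed form from the text
    return p

for k in range(2, 9):
    print(k, second_order_terms(k))
```

A full model estimable design needs at least this many distinct runs, which is the motivation for the run-efficient designs discussed in this chapter.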

2.2.1 $2^k$ and $3^k$ Factorial Designs.

In contrast to the One Factor at a Time (OFAT) design strategy where factors are
varied individually, Factorial Designs vary factors simultaneously thus allowing for
estimates of interactions between factors. If measurements are made on the system
or process response for all possible combinations of the values or levels of the different
factors, the design plan is called a full factorial design experiment (Connor and Zelen,
1959). For example, if two factors A and B have a and b levels, respectively, each
factorial design replication would contain ab treatment combinations while adding an
additional factor C with c levels would require abc treatment combinations.

The $2^k$ Factorial Design consists of $k$ factors each at only two levels and is a special
case of the full factorial design with $2^k$ observations per replication. $2^k$ designs have
many useful properties. In addition to being orthogonal, $2^k$ designs are A-, G-, D-,
and I-optimal for fitting a first-order model or first-order model with interactions
(Montgomery, 2013). The $2^k$-type designs are widely used for factor screening as they
provide the smallest number of runs for independently estimating all main effects
and interactions for $k$ factors. In total, the number of estimable effects for a $2^k$
design is $2^k - 1$, consisting of $k$ main effects, $\binom{k}{2}$ two-way interactions, $\binom{k}{3}$ three-way
interactions, ..., and 1 $k$-way interaction. A $2^3$ design with three factors, denoted
A, B, and C, can estimate $k = 3$ main effects (A, B, C), $\binom{3}{2} = 3$ two-way interactions
(AB, AC, BC), and one 3-way interaction (ABC). See Table 1 for the $2^3$ design
matrix.


Table 1. A $2^3$ Full Factorial Design

Run   A   B   C   AB  AC  BC  ABC
1     -   -   -   +   +   +   -
2     +   -   -   -   -   +   +
3     -   +   -   -   +   -   +
4     +   +   -   +   -   -   -
5     -   -   +   +   -   -   +
6     +   -   +   -   +   -   -
7     -   +   +   -   -   +   -
8     +   +   +   +   +   +   +

Note: Factor settings have been coded, replacing the low setting
by - and the high setting by +.
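
The design matrix in Table 1 can be regenerated programmatically. The following is a small sketch using only the Python standard library; the function name `full_factorial_2k` is an illustrative choice, and the run order is arranged so that the first factor varies fastest, as in the table.

```python
from itertools import product

def full_factorial_2k(k):
    """Coded runs of a 2^k design, with the first factor varying fastest."""
    # product() varies its last element fastest, so reverse each tuple.
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]

sign = {-1: "-", 1: "+"}
print("Run  A  B  C  AB AC BC ABC")
table = []
for run, (a, b, c) in enumerate(full_factorial_2k(3), start=1):
    # Interaction columns are element-wise products of the factor columns.
    row = (a, b, c, a * b, a * c, b * c, a * b * c)
    table.append(row)
    print(f"{run:>3}  " + "  ".join(sign[v] for v in row))
```

Each of the seven columns sums to zero and every pair of columns is orthogonal, which is why all $2^3 - 1 = 7$ effects can be estimated independently.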

The $3^k$ Factorial Design, which consists of $k$ factors each at only three levels, is a
special case of the full factorial design with $N = 3^k$ observations per replication. See
Table 2 for a $3^3$ design matrix.


Table 2. A $3^3$ Full Factorial Design

Run   A   B   C   A^2  B^2  C^2  AB  AC  BC  ABC
1     0   0   0   2    2    2    2   2   2   0
2     0   0   1   2    2    1    2   1   1   1
3     0   0   2   2    2    2    2   0   0   2
4     0   1   0   2    1    2    1   2   1   1
5     0   1   1   2    1    1    1   1   1   1
6     0   1   2   2    1    2    1   0   1   1
7     0   2   0   2    2    2    0   2   0   2
8     0   2   1   2    2    1    0   1   1   1
9     0   2   2   2    2    2    0   0   2   0
10    1   0   0   1    2    2    1   1   2   1
11    1   0   1   1    2    1    1   1   1   1
12    1   0   2   1    2    2    1   1   0   1
13    1   1   0   1    1    2    1   1   1   1
14    1   1   1   1    1    1    1   1   1   1
15    1   1   2   1    1    2    1   1   1   1
16    1   2   0   1    2    2    1   1   0   1
17    1   2   1   1    2    1    1   1   1   1
18    1   2   2   1    2    2    1   1   2   1
19    2   0   0   2    2    2    0   0   2   2
20    2   0   1   2    2    1    0   1   1   1
21    2   0   2   2    2    2    0   2   0   0
22    2   1   0   2    1    2    1   0   1   1
23    2   1   1   2    1    1    1   1   1   1
24    2   1   2   2    1    2    1   2   1   1
25    2   2   0   2    2    2    2   0   0   0
26    2   2   1   2    2    1    2   1   1   1
27    2   2   2   2    2    2    2   2   2   2

Note: Factor settings have been coded, replacing the low setting by 0, intermediate setting
by 1, and the high setting by 2.


The addition of a third factor level over the $2^k$ design allows modeling the response
surface as a quadratic function. Each main effect has 2 degrees of freedom used to
estimate a first-order (linear) and second-order (quadratic) component. Each two-way
interaction has 4 degrees of freedom, one for each linear × linear, linear × quadratic,
quadratic × linear, and quadratic × quadratic effect. In total, the number of estimable
effects for a $3^k$ design is $3^k - 1$, consisting of $k$ main effects, $k$ pure quadratic effects,
$\binom{k}{2}$ two-way interactions each with four degrees of freedom, $\binom{k}{3}$ three-way interactions
each with eight degrees of freedom, ..., and 1 $k$-way interaction with $2^k$ degrees of freedom.
For example, a design with three factors, denoted A, B, and C, can estimate $k = 3$
main effects (A, B, C), $k = 3$ pure quadratic effects ($A^2$, $B^2$, $C^2$), $\binom{3}{2} = 3$ two-way
interactions (AB, AC, BC), and one 3-way interaction (ABC).
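
The degree-of-freedom bookkeeping can be checked with a few lines of arithmetic. In the sketch below, the linear and quadratic components of each factor are counted together as the two degrees of freedom of its main effect; the function name `three_level_df` is illustrative.

```python
from math import comb

def three_level_df(k):
    """Degrees of freedom by interaction order for a 3^k factorial.
    Each j-factor effect carries 2**j degrees of freedom."""
    return {j: comb(k, j) * 2 ** j for j in range(1, k + 1)}

df = three_level_df(3)
print(df, sum(df.values()))
```

For three factors the totals by order add up to $3^3 - 1 = 26$, the number of degrees of freedom available beyond the intercept in a single replicate.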

2.2.2  Central  Composite  Designs  (CCD). 

Box and Wilson (1951) introduced a class of response surface designs as an
alternative to the $3^k$ factorial designs. The Central Composite Designs (CCD) contain a $2^k$
or $2^{k-p}$ (see 2.3.1) design, axial/star runs, and center runs which are set at the middle
of the factor range. One reason the CCD is a popular class of second-order
designs is the sequential nature in which it can be implemented.
Typically, if the first-order response model associated with the $2^k$ or $2^{k-p}$ design proves to
be a poor representation of the system response, center points are added to provide
information on the overall curvature in the system while axial points are added to
allow for the fitting of a second-order response model.

In addition to the number of runs associated with the $2^k$ or $2^{k-p}$ design, the
CCD contains $2k$ axial runs per replication, two on the axis of each factor at a distance
$\alpha$ from the center of the design. As such, the CCD typically involves $k$ factors at 5
levels per factor. The value of $\alpha$ can be chosen so the design is rotatable, meaning the
prediction variance is the same at all points $\mathbf{x}$ that are equidistant from the design
center. For the CCD, the rotatable condition is satisfied by choosing $\alpha = \sqrt[4]{n_f}$, where $n_f$
is the number of factorial runs. In other words, the variance of predicted response
$\mathrm{Var}[\hat{y}(\mathbf{x})]$ is constant on spheres (Montgomery, 2013). However, it is not necessary to
have exact rotatability. By using $\alpha = \sqrt{k}$, the CCD is not necessarily rotatable, but
the loss in rotatability is negligible while producing a more preferable design (e.g.,
more meaningful design-level settings; Myers and Anderson-Cook, 2009).

Lastly, the CCD contains $n_c$ center runs. The number of center runs affects
the variance of the predicted response $\mathrm{Var}[\hat{y}(x)]$. In the case of spherical or near
spherical designs ($\alpha = \sqrt{k}$ or $\alpha = \sqrt[4]{n_f}$), having 3 to 5 center runs achieves a
reasonable distribution of the scaled prediction variance,
$\mathrm{SPV}(x) = N\,\mathrm{Var}[\hat{y}(x)]/\sigma^2$, over the design region (Myers and Anderson-Cook, 2009).
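
To make the three components of a CCD concrete, the sketch below assembles the factorial, axial, and center runs for a full $2^k$ factorial core. It is an illustration under stated assumptions rather than the procedure used in this research: the function name `central_composite` and its arguments are hypothetical, a full (not fractional) factorial is assumed, and the rotatable value $\alpha = \sqrt[4]{n_f}$ is used unless another value is supplied.

```python
from itertools import product
from typing import Optional

def central_composite(k: int, n_center: int = 3, alpha: Optional[float] = None):
    """Return (runs, alpha) for a CCD built on a full 2^k factorial.

    If alpha is None, the rotatable choice alpha = n_f ** 0.25 is used,
    where n_f = 2**k is the number of factorial runs.
    """
    factorial = [[float(v) for v in pt] for pt in product((-1, 1), repeat=k)]
    if alpha is None:
        alpha = len(factorial) ** 0.25
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            pt = [0.0] * k
            pt[i] = sign * alpha      # one factor at +/- alpha, the rest at 0
            axial.append(pt)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center, alpha

runs, alpha = central_composite(3)
print(len(runs), round(alpha, 4))            # total runs and axial distance
print(sorted({run[0] for run in runs}))      # the five levels of the first factor
```

Passing `alpha=1.0` gives a design with only three levels per factor, which is the face-centered variant described in the next subsection.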

2.2.3  Face-Centered  Composite  Designs  (FCD). 

A variant of the standard CCD is called the face-centered composite design (FCD).
This design locates the axial points in the center of each face of the factorial space
at a distance $\alpha = 1$ from the center. The face-centered design sacrifices rotatability
but is useful in design situations that prevent larger axial distances, such as designs
near the edges of performance envelopes. Additionally, as compared to the CCD, the
FCD requires only 1 to 2 center runs in order to achieve a reasonable distribution of
the scaled prediction variance over the design region.

2.2.4  Computer  Generated  Designs. 

Two important and useful concepts in statistical procedures used to assess experimental
designs are optimality and robustness. Whereas the robustness of a design
implies the design is insensitive to assumptions and/or models, optimal designs are
generally developed for a specific set of assumptions and/or models.

Based upon the empirical model selected to represent the system response, available
sample size, design factor values, a set of candidate points, and other constraints,
“optimal” designs can be generated through the use of computer algorithms. While
many criteria are available with which to generate designs, the criterion most often
used due to its relatively simple computational nature is D-optimality (Myers and
Anderson-Cook, 2009). However, some computer packages use a criterion based upon
good prediction capacity through examining scaled prediction variance. For instance,
JMP can generate both D-optimal and I-optimal designs.

In contrast to algorithms where all possible sets of candidate points were evaluated,
Meyer and Nachtsheim (1995) developed a coordinate exchange algorithm which
systematically searches individual design coordinates to find the optimal settings, thereby
removing the requirement for a candidate set of runs (Montgomery, 2013).
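
The idea of exchanging one coordinate at a time can be illustrated with a short sketch. This is a simplified illustration and not the published Meyer and Nachtsheim (1995) algorithm: it assumes a full second-order model, restricts every coordinate to the three levels in `LEVELS`, uses a single random start (no restarts), and all function names are hypothetical.

```python
import random
from itertools import combinations

LEVELS = (-1.0, 0.0, 1.0)  # assumed candidate levels for every coordinate

def model_row(x):
    """Expand one run into full second-order model terms."""
    k = len(x)
    row = [1.0] + list(x) + [v * v for v in x]
    row += [x[i] * x[j] for i, j in combinations(range(k), 2)]
    return row

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [r[:] for r in m]
    n, d = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[piv][c]) < 1e-12:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for cc in range(c, n):
                a[r][cc] -= f * a[c][cc]
    return d

def d_criterion(design):
    """|X'X| for the second-order model matrix of the design."""
    X = [model_row(run) for run in design]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    return det(xtx)

def coordinate_exchange(design, max_passes=20):
    """Improve the design one coordinate at a time until a pass makes no change."""
    design = [list(run) for run in design]
    best = d_criterion(design)
    for _ in range(max_passes):
        improved = False
        for run in design:
            for j in range(len(run)):
                original = run[j]
                for level in LEVELS:
                    run[j] = level
                    value = d_criterion(design)
                    if value > best + 1e-9:
                        best, original, improved = value, level, True
                run[j] = original   # keep the best level found for this coordinate
        if not improved:
            break
    return design, best

random.seed(1)
start = [[random.choice(LEVELS) for _ in range(2)] for _ in range(8)]
final, value = coordinate_exchange(start)
print(d_criterion(start), value)   # criterion before and after the exchanges
```

Because no candidate set is enumerated, the work per pass grows with the number of coordinates rather than with the number of candidate points, which is the practical advantage noted above. A single start can stall at a local optimum, so practical implementations typically repeat the search from many random starts.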

The term “optimal” can be misleading as it implies the computer generated design
is the single best design to use in a given situation. In truth, the “optimal”
design is more likely to be one of a range of designs which can be used to meet a
specific scientific objective. Computer generated “optimal” designs carry both a benefit
and a disadvantage: a custom design can be created for any specified model
instead of using a standard design, but the design criterion depends upon the “correctness”
of the model matrix. DuMouchel and Jones (1994) addressed the model-dependency
problem by presenting “a Bayesian modification of D-optimality that allows the
experimenter to ‘hedge bets’ about an assumed model.”

While caution should be taken when dealing with computer generated designs,
there are times when they are helpful: for instance, when there are constraints on
factor-level combinations and sample size, when there are unusual combinations of
factor ranges, or when there is a need to augment some current design with additional
runs (Myers and Anderson-Cook, 2009).

2.3 Reduced Run Designs


As the number of factors $k$ increases, the run size requirement grows to the point
that full factorial designs are sometimes impractical and inefficient. The sparsity
of effects principle states the effects on a system or response of interest attributable
to most high-order interactions are negligible when compared to some of the main
effects and low-order interactions (Montgomery, 2013). For example, a full $2^7$ design
requires 128 runs for estimating 127 main effects and interactions, but sparsity of
effects means only a subset of the 7 main effects and 21 two-way interactions are
likely significant. As such, only a fraction of the complete $2^7$ runs are required to
obtain estimates of the significant effects. As a result, reduced run designs have been
developed to be more efficient in terms of design size.

2.3.1 $2^{k-p}$ and $3^{k-p}$ Fractional Factorial Designs (FFD).

The $2^{k-p}$ Fractional Factorial Design is comprised of a subset of the runs of the
$2^k$ Factorial Design. Similar to the $2^k$ Factorial Design, the $2^{k-p}$ Fractional Factorial
Design consists of $k$ factors each at only two levels. The value of $p$ specifies the
degree to which the design is fractionated, determined by $1/2^p$.


Table 3. A $2^{7-4}$ Fractional Factorial Design, Principal Fraction

Run   A   B   C   D=AB   E=AC   F=BC   G=ABC
1     +   +   +   +      +      +      +
2     +   +   -   +      -      -      -
3     +   -   +   -      +      -      -
4     +   -   -   -      -      +      +
5     -   +   +   -      -      +      -
6     -   +   -   -      +      -      +
7     -   -   +   +      -      -      +
8     -   -   -   +      +      +      -


For instance, a $2^{7-4}$ design (see Table 3) is a $1/2^4 = 1/16$th fraction of the $2^7$
design. As such, the $2^{7-4}$ design contains 8 runs, or 1/16th of the 128 runs for the $2^7$
design. A key issue is how the fractional design should be selected.

Generally, the first $k-p$ independent columns are generated by the runs of the full
$2^{k-p}$ factorial design. In the $2^{7-4}$ design, the first 3 columns are generated by the runs
associated with the $2^3$ full factorial design. The remaining $p$ columns can be generated as
interactions of the first $k-p$ columns (Wu and Hamada, 2011). While these $p$
columns are dependent upon the first $k-p$ columns, they are independent of each
other. As such, the value of $p$ determines the required number of independent design
generators. Because the design generators are determined by column interactions,
the $p$ factor effect estimates are aliased, meaning the factor effects on the system
response cannot be estimated separately from factor interactions.

For the $2^{7-4}$ design in Table 3, the $p = 4$ design generators are D = AB, E = AC,
F = BC, and G = ABC. Since D = AB, the estimate of the effect of factor D on
the response is confounded with the AB interaction. The degree to which the effects are
aliased is given by the design resolution. The $2^{7-4}$ design in Table 3 is of Resolution
III ($2_{III}^{7-4}$) because main effects (D) are aliased with two-way interactions (AB). The
technique used to generate the design in Table 3 will provide the “principal” fraction
of a complete $2^p$ family of fractions. In practice any of the remaining $2^p - 1$ fractions
may be used, each having the same design resolution.
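
The construction just described can be reproduced in a few lines. The sketch below is illustrative only; the function name `fractional_factorial` and the dictionary layout are choices made here, while the generators are those listed for Table 3. It builds the $2^3$ full factorial in A, B, and C and then sets each added factor to the product of its generator columns.

```python
from itertools import product

GENERATORS = {"D": "AB", "E": "AC", "F": "BC", "G": "ABC"}  # from Table 3

def fractional_factorial(base="ABC", generators=GENERATORS):
    """Build the principal fraction: a full factorial in the base factors,
    with each added factor set to the product of its generator columns."""
    runs = []
    for levels in product((1, -1), repeat=len(base)):
        run = dict(zip(base, levels))
        for factor, word in generators.items():
            value = 1
            for letter in word:
                value *= run[letter]
            run[factor] = value
        runs.append(run)
    return runs

design = fractional_factorial()
for i, run in enumerate(design, start=1):
    print(i, " ".join("+" if run[f] > 0 else "-" for f in "ABCDEFG"))

# Column D is identical to the AB interaction column, so the two are aliased.
print(all(run["D"] == run["A"] * run["B"] for run in design))
```

Multiplying any generated column by -1 selects one of the other fractions in the same family.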

While the design generators identify some of the alias structure, the complete
design alias structure is determined by the complete defining relation for the design,
obtained by multiplying together all combinations of the design generators. The defining
relation is comprised of the $p = 4$ design generators and their $2^p - p - 1 = 11$ interactions
(Montgomery, 2013). While $2_V^{k-p}$ designs would be more desirable because of their
aliasing structure, $2_{III}^{k-p}$ and $2_{IV}^{k-p}$ designs are most commonly used for screening due
to more economical run sizes.
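
As a worked check of the counts above, the complete defining relation for the Table 3 design can be enumerated. This is a minimal sketch with hypothetical names; it writes each generator as a defining word (for example, D = AB becomes I = ABD) and multiplies words by taking the symmetric difference of their letters, since any squared letter cancels.

```python
from itertools import combinations

# Generators of the Table 3 design written as defining words (I = ABD, ...).
WORDS = [frozenset("ABD"), frozenset("ACE"), frozenset("BCF"), frozenset("ABCG")]

def defining_relation(words):
    """All products of one or more generator words.

    Multiplying words cancels squared letters, which is the symmetric
    difference of the letter sets.
    """
    relation = []
    for size in range(1, len(words) + 1):
        for combo in combinations(words, size):
            word_product = frozenset()
            for word in combo:
                word_product = word_product ^ word
            relation.append("".join(sorted(word_product)))
    return relation

relation = defining_relation(WORDS)
print(len(relation))                    # generators plus their interactions
print(min(len(w) for w in relation))    # shortest word length, i.e. the resolution
print(sorted(relation, key=lambda w: (len(w), w)))
```

The shortest word has three letters, which agrees with the Resolution III classification of this design.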

When a large number of factors are being considered, the $3^k$ factorial can be
excessively large, even more so than for the $2^k$ factorial. However, similar to the
$2^k$ factorial, under sparsity of effects, fractional designs can be considered which still
provide sufficient information for estimating significant effects.

The $3^{k-p}$ Fractional Factorial Design consists of $k$ factors each at three levels.
The value of $p$ again specifies the degree to which the design is fractionated, determined
by $1/3^p$. For instance, a $3^{k-2}$ design is a 1/9th fraction, while a $3^{k-3}$ design is
a 1/27th fraction. A general procedure for constructing a $3^{k-p}$ fractional factorial
design is given by Montgomery (2013). Connor and Zelen (1959) and Xu (2005)
provide an extensive list of $3^{k-p}$ designs. Unfortunately, compared to
$2^{k-p}$ designs, the aliasing structure for $3^{k-p}$ designs is very complex, especially as the
level of fractioning increases. If effect interactions are not negligible, design results
can be difficult, even nearly impossible, to interpret because of the partial aliasing of
two-degree-of-freedom components (Montgomery, 2013).

Regular designs are $2^{k-p}$ and $3^{k-p}$ designs constructed through defining relations
among their factors. Nonregular designs lack such a defining relation. Two-level
nonregular designs often used for factor screening are Plackett-Burman and Supersaturated
designs (Cheng and Wu, 2001). Three-level nonregular response surface designs include
Box-Behnken designs.

2.3.2  Plackett-Burman  Designs  (PB). 

Plackett and Burman (1946) developed nonregular two-level fractional factorial
designs which can study $k = N - 1$ variables in $N$ runs, where $N$ is a multiple of 4. If
$N = 2^i$ for $i > 2$, PB designs are equivalent to regular $2^{k-p}$ fractional factorial designs.
An example design is presented in Table 4 where $N = 12$ runs for $k = 11$ factors.


Table 4. A 12-run Plackett-Burman Design for 11 factors

Run   A   B   C   D   E   F   G   H   I   J   K
1     +   -   +   -   -   -   +   +   +   -   +
2     +   +   -   +   -   -   -   +   +   +   -
3     -   +   +   -   +   -   -   -   +   +   +
4     +   -   +   +   -   +   -   -   -   +   +
5     +   +   -   +   +   -   +   -   -   -   +
6     +   +   +   -   +   +   -   +   -   -   -
7     -   +   +   +   -   +   +   -   +   -   -
8     -   -   +   +   +   -   +   +   -   +   -
9     -   -   -   +   +   +   -   +   +   -   +
10    +   -   -   -   +   +   +   -   +   +   -
11    -   +   -   -   -   +   +   +   -   +   +
12    -   -   -   -   -   -   -   -   -   -   -

The nonregular Plackett-Burman designs sacrifice a simple alias structure for better
run economy and projectivity when compared to regular $2^{k-p}$ designs. A $2_{III}^{k-p}$ design
has projectivity 2, meaning it will collapse into a full $2^2$ factorial in any subset of two of
the original $k$ factors, while Plackett-Burman designs have projectivity 3 or 4 depending
upon the design size. For instance, the Table 4 design will project into a full $2^3$ factorial
from 11 factors in 12 runs, while a comparable $2_{III}^{11-7}$ design will only project into a full
$2^2$ factorial from 11 factors in 16 runs.

Unfortunately, PB designs have complex alias structures. In the Table 4 design,
each main effect is partially aliased with 45 two-factor interactions, while each main
effect in a $2_{III}^{11-7}$ design is completely aliased with at most 4 two-factor interactions
(Montgomery, 2013). Due to the complex aliasing structure, analysis of PB designs
can become difficult. Hamada and Wu (1992) discuss methods for analyzing designs
with complex aliasing based upon the sparsity of effect and effect heredity principles.
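
The partial aliasing in the 12-run design can be examined directly. The sketch below is an illustration with hypothetical function names; it assumes the design is generated by cyclically shifting the first run of Table 4 and appending a final run with every factor at its low level. It then checks that the main-effect columns are orthogonal and computes the correlation between the factor A column and each two-factor interaction among the other ten factors.

```python
from itertools import combinations

FIRST_ROW = [1, -1, 1, -1, -1, -1, 1, 1, 1, -1, 1]  # run 1 of Table 4

def plackett_burman_12(first_row=FIRST_ROW):
    """Rows 1-11 are successive right cyclic shifts of the first row;
    the final row sets every factor to its low level."""
    n = len(first_row)
    rows = [first_row[-s:] + first_row[:-s] if s else list(first_row) for s in range(n)]
    rows.append([-1] * n)
    return rows

def column(design, j):
    return [run[j] for run in design]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

design = plackett_burman_12()
cols = [column(design, j) for j in range(11)]

# Main-effect columns are mutually orthogonal.
print(all(dot(cols[i], cols[j]) == 0 for i, j in combinations(range(11), 2)))

# Correlation of the factor A column with every two-factor interaction
# among the other ten factors.
corr = [dot(cols[0], [a * b for a, b in zip(cols[i], cols[j])]) / 12
        for i, j in combinations(range(1, 11), 2)]
print(len(corr), sorted({round(abs(c), 4) for c in corr}))
```

Each of the 45 interactions is correlated with the main effect, but none completely, which is what is meant by partial aliasing.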

The effect heredity principle states that if an interaction is significant, the components
of the interaction are significant. Under strong heredity all main effects within a
significant interaction are themselves significant; however, under weak heredity not
all the main effects need be significant. In combination with effect sparsity, effect heredity
would be concerned with only significant two-factor interactions. Thus if AB is
significant, then under strong heredity, A and B would be significant while under
weak heredity only A or B would be significant (Montgomery, 2013).
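
One way the heredity principle is commonly applied in analysis is to restrict which two-factor interactions are entertained once a set of main effects has been judged active. The sketch below illustrates that use under stated assumptions; the function `candidate_interactions` is hypothetical and is not the procedure of Hamada and Wu (1992).

```python
from itertools import combinations

def candidate_interactions(factors, active_mains, heredity="strong"):
    """Two-factor interactions retained as candidates given the active main effects.

    strong: both parent main effects must be active.
    weak:   at least one parent main effect must be active.
    """
    active = set(active_mains)
    needed = 2 if heredity == "strong" else 1
    keep = []
    for a, b in combinations(factors, 2):
        parents_active = (a in active) + (b in active)
        if parents_active >= needed:
            keep.append(a + b)
    return keep

factors = "ABCD"
print(candidate_interactions(factors, {"A", "B"}, "strong"))
print(candidate_interactions(factors, {"A", "B"}, "weak"))
```

With A and B active, strong heredity leaves only AB as a candidate, while weak heredity also admits every interaction that shares a factor with A or B.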

2.3.3  Box-Behnken  Designs  (BBD). 

Box and Behnken (1960) developed a family of 17 efficient rotatable/near-rotatable
spherical three-level designs suitable for fitting second-order (quadratic) response
models. The BBD are formed by combining two-level factorials with balanced incomplete
block designs (BIBD) or partially balanced incomplete block designs (PBIBD).
In contrast to the CCD, the Box-Behnken design does not contain any points at
the vertices or face-centers of the design but rather at the centers of the edges of the
process space. As a result, the Box-Behnken designs avoid extreme factor-level
combinations which may be impossible to test due to cost or physical process
constraints (Montgomery, 2013).

Of the original 17 designs proposed by Box and Behnken, 10 were constructed from
BIBDs while 7 were constructed from PBIBDs. BIBD are incomplete block designs
where each factor appears an equal number of times with every other factor, while
with PBIBD each factor does not appear an equal number of times with every other
factor. The BBD are formed by varying $p$ parameters in a full factorial manner while the
remaining $k - p$ parameters are kept steady at the center factor level setting. For $k = 3$
to 5 and $k = 6$ to 7, $p = 2$ and 3, respectively, for the BBD. Additionally, the BBD uses
three to five center runs to avoid singularity in the design matrix for $k = 4$ and 7 and
to maintain favorable design qualities like a reasonable $\mathrm{Var}[\hat{y}(x)]$ distribution (Myers
and Anderson-Cook, 2009).

Overall, the design run requirements for both the BBD and CCD are comparable.
For $k = 3$ and 5, the CCD with the full two-level factorial requires two more runs,
not including center runs, than the BBD, while for $k = 4$ the CCD and BBD require
an equal number of runs. As a result, the benefit of employing a BBD over a
CCD is not necessarily run efficiency but rather the location of the factor-level
combinations in the design space.
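
The run counts quoted above can be reproduced with a short sketch. It assumes the pairwise construction that applies for $k = 3$, 4, and 5, in which every pair of factors is varied in a $2^2$ factorial while the other factors are held at the center level; the function names are hypothetical and center runs are excluded from both counts.

```python
from itertools import combinations, product

def box_behnken_pairs(k: int):
    """Non-center runs of a Box-Behnken design that varies every pair of factors
    in a 2^2 factorial while holding the other factors at the center level.
    (Assumes the pairwise construction used for k = 3, 4, 5.)"""
    runs = []
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * k
            run[i], run[j] = a, b
            runs.append(run)
    return runs

def ccd_run_count(k: int) -> int:
    """Factorial plus axial runs of a CCD built on the full 2^k factorial."""
    return 2**k + 2 * k

for k in (3, 4, 5):
    print(k, len(box_behnken_pairs(k)), ccd_run_count(k))
```

Every generated BBD run has exactly two nonzero coordinates, so no run sits at a vertex of the cube, which is the design-space property emphasized in the text.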

Through the years some of the original BBD have been improved upon in terms of
rotatability, average prediction variance, and D- and G-efficiency (Nguyen and Borkowski,
2008). In addition, new Box-Behnken type designs with larger $k$ (Mee, 2000) and differing
orthogonally blocked solutions (Nguyen and Borkowski, 2008) than the original
BBD have been proposed. More recently, small Box-Behnken Designs (SBBD) have
been proposed which reduce the run size requirement of the original BBD by replacing
the full $2^k$ factorial designs within the balanced incomplete block designs (BIBD)
or partially balanced incomplete block designs (PBIBD) partly by $2_{III}^{3-1}$ designs and
partly by full factorial designs (Zhang et al., 2011). When compared to the original
BBD, the SBBD possess smaller D-efficiency values, but the values are still relatively
high (> 70%) for $k < 11$ while requiring fewer runs.

2.3.4  Other  Reduced  Run  Designs. 

Oehlert and Whitcomb (2002) proposed a class of equireplicated irregular fractions
of $2^k$ factorials with resolution V, where equireplicated means each factor occurs an
equal number of times at its high and low levels. These designs, called Minimum-Run
Res V Designs, are constructed using the Li and Wu (1997) columnwise-pairwise
algorithm to optimize the D-optimality criterion. These may be used on their own if
interested in a first-order response model or used as the factorial component of the
CCD for a second-order response model estimation.

Morris (2000) proposed a method for constructing three-level designs, called augmented
pairs designs, suitable for fitting second-order response models within a
cuboidal region of interest. Starting with an initial two-level first-order design, the
third level of each factor is determined by a linear combination of the levels of every
pair of points. In comparison to BBD and CCD, Morris (2000) showed that
the precision of the model parameter and expected response estimates is favorable while
requiring fewer runs (Myers et al., 2004). In comparison to small composite designs,
augmented pairs designs show better model parameter and expected response estimates
but do require more runs.

Gilmour (2006) introduced a class of three-level designs made up of subsets of $3^k$
factorial designs, befittingly known as subset designs. Letting $S_r$ represent the $r$th
orbit, subset designs have the form $c_0 S_0 + c_1 S_1 + \cdots + c_k S_k$, where $c_r$, $r = 1, \ldots, k$,
is the number of replicates of the points in $S_r$ and an orbit is comprised of the subset
of points of the $3^k$ factorial design on the hypersphere of radius $\sqrt{r}$ about the center
point, $S_0$. As such, each subset design may contain points from any number of orbits,
with each subset $S_r$ containing $\binom{q}{r} 2^r$ points consisting of a $2^r$ factorial design at levels
$\pm 1$ for each combination of $r$ factors and with the remaining $q - r$ factors at 0, where
$q = k$ is the number of factors. In
order for the subset design to be capable of fitting a second-order response, Gilmour
(2006) stipulated two requirements:

• $c_r > 0$ for at least two $r$ and $c_r > 0$ for at least one $r$ with $1 \le r \le q - 1$, so that
all quadratic parameters can be estimated

• $c_r > 0$ for at least one $r \ge 2$, so that all interactions can be estimated.

Additionally, Gilmour (2006) specified fractional subset and incomplete subset reduced
run designs, where fractional subset designs replace all the $2^r$ factorials in at
least one $S_r$ by a fractional factorial and incomplete subset designs use a reduced
number of the $\binom{q}{r}$ factorial sets of $r$ factors.
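
The orbit structure described above is easy to enumerate. The following sketch is an illustration with hypothetical names; it generates the points of an orbit $S_r$ for $q$ factors and confirms that each orbit holds $\binom{q}{r} 2^r$ points, all at squared distance $r$ from the center. The combined design at the end is only an example of the $c_0 S_0 + \cdots + c_q S_q$ form and is not taken from Gilmour (2006).

```python
from itertools import combinations, product
from math import comb

def orbit(q: int, r: int):
    """Points of the 3^q factorial with exactly r factors at +/-1 and the rest at 0."""
    points = []
    for active in combinations(range(q), r):
        for signs in product((-1, 1), repeat=r):
            point = [0] * q
            for index, sign in zip(active, signs):
                point[index] = sign
            points.append(tuple(point))
    return points

q = 3
for r in range(q + 1):
    pts = orbit(q, r)
    radius_sq = {sum(v * v for v in p) for p in pts}
    print(r, len(pts), comb(q, r) * 2**r, radius_sq)

# A hypothetical subset design S_3 + S_1 + 2 S_0 (two center replicates).
design = orbit(q, 3) + orbit(q, 1) + 2 * orbit(q, 0)
print(len(design))
```

For $q = 3$ the four orbits together account for all 27 points of the $3^3$ factorial, and the example design corresponds to cube points, axial points at distance 1, and two center runs.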

2.4 Saturated/Near-Saturated Designs


Recall a second-order model contains $p = 1 + 2k + k(k-1)/2$ terms. A $k = 4$
factor BBD has 15 terms to estimate, but the design itself contains 24 points
plus center runs. While reduced run designs like the CCD and the BBD are more
efficient than designs in which the full model is estimable, they can still possess
far more design points than needed to estimate the second-order response effects.
As a result, the class of saturated or near-saturated designs has been developed.
Saturated or near-saturated designs are designs such that the number of design points
is equal to or near, but not less than, the number of terms in the design model.
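
As a quick arithmetic check, the number of second-order model terms can be tabulated for the factor counts considered in this section. The function name below is a choice made here for illustration.

```python
def second_order_terms(k: int) -> int:
    """Intercept + k linear + k pure quadratic + k(k-1)/2 two-factor interaction terms."""
    return 1 + 2 * k + k * (k - 1) // 2

for k in range(3, 11):
    p = second_order_terms(k)
    assert p == (k + 1) * (k + 2) // 2   # equivalent closed form
    print(k, p)
```

For $k = 4$ this gives the 15 terms quoted above, against 24 non-center points in the corresponding BBD.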

2.4.1  Small  Composite  Designs  (SCD). 

In contrast to the CCD and FCD, which contain a $2^k$ or $2_V^{k-p}$ factorial design,
Hartley (1959) suggested replacing the factorial design with a special resolution III
factorial design, denoted III*, where two-factor interactions are not aliased with other
two-factor interactions. As a result, the number of design runs is decreased, resulting
in Small Composite Designs (SCD). The SCD sacrifices good prediction variance
properties with the reduction in run size because main effects could be aliased with
two-factor interactions. However, the SCD still allows for the estimation of all
main effects because the star portion of the design provides additional information.
While Hartley (1959) suggested replacing the $2^k$ or $2_V^{k-p}$ factorial with a III* factorial,
additional work included using irregular $2^k$ fractions (Westlake, 1965) and columns of
Plackett-Burman designs (Draper, 1985). Draper and Lin (1990) improved upon the
previous design work associated with modifying the composite structure of the SCD
by adding a $k = 10$ design and reducing the run size of previous designs by deleting
repeat runs (see Table 5). While deleting repeat runs reduces the design size, it also
reduces the amount of information available to estimate pure error.

Table 5. Cube Points in Some Small Composite Designs (Draper and Lin, 1990)

Columns: k = number of factors; p = (k+1)(k+2)/2 coefficients; 2k = star points;
Min = minimal points in cube; the remaining columns give the cube points (and design) by source;
D&L = Draper and Lin (1990); D&L* = Draper and Lin (1990) after elimination of repeat runs.

k    p    2k   Min  Box and Hunter    Hartley                 Westlake             Draper   D&L   D&L*
                    (1957)            (1959)                  (1965)               (1985)
3    10   6    4    8 ($2^3$)         4 ($2_{III*}^{3-1}$)    -                    -        4     4
4    15   8    7    16 ($2^4$)        8 ($2_{III*}^{4-1}$)    -                    -        8     8
5    21   10   11   16 ($2^{5-1}$)    -                       12 ((3/8)$2^5$)      12       12    11
6    28   12   16   32 ($2^{6-1}$)    16 ($2_{III*}^{6-2}$)   -                    -        16    16
7    36   14   22   64 ($2^{7-1}$)    32 ($2_{III*}^{7-2}$)   26 ((13/64)$2^7$)    28       24    22
8    45   16   29   64 ($2^{8-2}$)    -                       -                    -        36    30
9    55   18   37   128 ($2^{9-2}$)   64 ($2_{III*}^{9-3}$)   44 ((11/128)$2^9$)   44       40    38
10   66   20   46   128 ($2^{10-3}$)  -                       -                    -        48    46


2.4.2  Rechtschaffner  Designs. 

Rechtschaffner  (1967)  presented  a  class  of  saturated  second-order  designs  for  k 
factors  in  a  cuboidal  region  of  interest  based  upon  4  different  design  generators  shown 
in  Table  6.  Each  design  generator  is  identified  with  the  different  terms  of  the  second- 
order  model. 

Table 6. Design Generators for Saturated Fractions of $3^n$ Factorial Design

Number   Design Generator
I        (-1, ..., -1)          for all n
II       (-1, 1, ..., 1)        for all n
III      (-1, -1, 1)            for n = 3
         (1, 1, -1, ..., -1)    for n > 3
IV       (1, 0, ..., 0)         for all n

For instance, Design Generator I is identified with the intercept term while Design
Generators II and III are identified with the main effects and two-way interaction
effects, respectively. Treatment combinations are obtained by permuting the elements
of each design generator to reach the desired saturated fraction (see Table 7). While the
designs are not based upon the D-optimality criterion, the signs of the design generators
can be varied in order to get higher D-criterion values. Unfortunately, while Rechtschaffner
designs are available for any $k$, they should be limited to small values of $k$ because
as $k$ grows the designs can be shown to have an asymptotic D-efficiency of 0 with
respect to the class of saturated designs (Notz, 1982).


Table 7. Saturated Fraction of a $3^5$ Factorial Design

Run   Design Generator        A    B    C    D    E
1     (-1, -1, -1, -1, -1)   -1   -1   -1   -1   -1
2     (-1, 1, 1, 1, 1)       -1    1    1    1    1
3                             1   -1    1    1    1
4                             1    1   -1    1    1
5                             1    1    1   -1    1
6                             1    1    1    1   -1
7     (1, 1, -1, -1, -1)      1    1   -1   -1   -1
8                             1   -1    1   -1   -1
9                             1   -1   -1    1   -1
10                            1   -1   -1   -1    1
11                           -1    1    1   -1   -1
12                           -1    1   -1    1   -1
13                           -1    1   -1   -1    1
14                           -1   -1    1    1   -1
15                           -1   -1    1   -1    1
16                           -1   -1   -1    1    1
17    (1, 0, 0, 0, 0)         1    0    0    0    0
18                            0    1    0    0    0
19                            0    0    1    0    0
20                            0    0    0    1    0
21                            0    0    0    0    1
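
The permutation step that produces Table 7 from the generators in Table 6 can be sketched as follows. This is an illustration only; the function name `rechtschaffner` is hypothetical, the generator signs are taken as listed in Table 6 without any sign optimization, and the run order produced by the code is not claimed to match the order in Table 7.

```python
from itertools import permutations

def rechtschaffner(n: int):
    """Distinct permutations of the four Table 6 generators for n >= 3 factors."""
    generators = [
        (-1,) * n,                                             # I
        (-1,) + (1,) * (n - 1),                                # II
        (-1, -1, 1) if n == 3 else (1, 1) + (-1,) * (n - 2),   # III
        (1,) + (0,) * (n - 1),                                 # IV
    ]
    runs = []
    for gen in generators:
        # set() removes the duplicate orderings caused by repeated levels
        runs.extend(sorted(set(permutations(gen))))
    return runs

design = rechtschaffner(5)
print(len(design), (5 + 1) * (5 + 2) // 2)   # runs versus second-order model terms
```

The number of runs equals the number of terms in the second-order model, which is what makes the fraction saturated.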

2.4.3  Box-Draper  Designs. 

Box and Draper (1974) presented a class of saturated second-order designs for
$k$ factors in a cuboidal region of interest based on D-optimality. Although the
designs are optimal for $k = 2$ and 3, they are not optimal for $k \ge 4$. Dubova and
Fedorov (1972) found a better design for $k = 4$ and Notz (1982) found better designs
for $k = 5$ or 6 by presenting alternative designs with higher D-optimality criterion values.
Additionally, while better designs for $k \ge 7$ have not been identified, the Box and Draper
(1974) designs were proved not optimal via an existence result. Therefore, the Box and
Draper designs are minimal D-optimal designs for $k = 2$ and 3, and are “minimal
designs of a simple form for any $k$” for a cuboidal region of interest. Similar to the
Rechtschaffner designs, the Box and Draper designs are available for any $k$; however,
they too should be limited to small values of $k$ because as $k$ grows the designs can be
shown to have an asymptotic D-efficiency of 0 with respect to the class of saturated
designs (Notz, 1982).

2.4.4  Hybrid  Designs. 

Roquemore (1976) presented a set of saturated or near-saturated second-order
designs for $k = 3$ to 6 factors which are rotatable or near-rotatable while achieving
the same degree of orthogonality as a CCD. A hybrid design for $k$ variables is
constructed by first augmenting a $k - 1$ variable central composite design with an
additional column for variable $k$. The design is then augmented with additional runs
for variable $k$ at different levels to create desirable design properties. Table 8 shows
the design matrix for hybrid 310. In this instance, $k = 3$, so the hybrid design contains
a $k = 2$ CCD augmented with a third column. These designs do suffer from having
odd factor level settings. For instance, none of the 10 factor level settings for C in
Table 8 are set to the typical values of 0 or $\pm 1$.

Table 8. Hybrid Design 310: k = 3 and n = 10

Run   A        B        C
1      0        0        1.2906
2      0        0       -0.1360
3     -1       -1        0.6386
4      1       -1        0.6386
5     -1        1        0.6386
6      1        1        0.6386
7      1.736    0       -0.9273
8     -1.736    0       -0.9273
9      0        1.736   -0.9273
10     0       -1.736   -0.9273

2.4.5 Hoke Designs.


Hoke (1974) presented a class of second-order designs for $k = 3$ to 6 factors at
3 levels based on saturated and near-saturated irregular fractions of the $3^k$ factorial.
For each number of factors $k$, seven versions of the Hoke designs exist, denoted D1,
D2, ..., D7, consisting of a mixture of factorial, axial, and edge points, making the
Hoke designs suitable for a cuboidal region of interest (Myers and Anderson-Cook,
2009). Tables 9 and 10 show the design matrices for two versions, one saturated (D2)
and one near-saturated (D6), of Hoke designs for $k = 3$.

Table 9. Hoke Design D2: k = 3 and n = 10

Run    A    B    C
1     -1   -1   -1
2      1    1   -1
3      1   -1    1
4     -1    1    1
5      1   -1   -1
6     -1    1   -1
7     -1   -1    1
8     -1    0    0
9      0   -1    0
10     0    0   -1


Table 10. Hoke Design D6: k = 3 and n = 13

Run    A    B    C
1     -1   -1   -1
2      1    1   -1
3      1   -1    1
4     -1    1    1
5      1   -1   -1
6     -1    1   -1
7     -1   -1    1
8     -1    0    0
9      0   -1    0
10     0    0   -1
11     1    1    0
12     1    0    1
13     0    1    1

Hoke compared his designs with Box-Behnken and SCD designs of comparable
size based upon the $\mathrm{tr}(X'X)^{-1}$ (A-optimality) and $|X'X|$ (D-optimality) criteria
and concluded that his designs compared favorably (Khuri and Cornell, 1996).

2.4.6 Other Minimal Run Designs.

Angelopoulos et al. (2009) presented a class of balanced, near-rotatable second-order
designs, suitable for a spherical region of interest, which minimize the number of
factorial runs associated with a CCD. Their designs were determined by searching
through designs with 0 (mod 2) factorial runs (i.e., keeping an even number of runs)
associated with a CCD for $k$ factors and selecting the design with the lowest possible
correlation among main effects. In order to discriminate between near-rotatable
CCDs, the Draper-Pukelsheim measure $Q^*$ was applied (Angelopoulos et al., 2009).
Unfortunately, it was determined that the $\alpha$-value for the CCD axial runs which provided
the maximum $Q^*$ value did so at the expense of efficiently estimating all the
parameters for a second-order model.


2.5  Supersaturated  Designs  (SSD) 

Supersaturated designs are a type of fractional factorial design where the number
of factors $k$ under investigation exceeds the number of available experimental runs
$N$. Since $k > N - 1$, the degrees of freedom within the design are insufficient to
estimate all the main effects and the design matrix cannot be orthogonal. Therefore,
in order for supersaturated designs to be useful as screening designs, only a few factors
can be active. As such, supersaturated designs are generally used when the number of
potential factors is large but few are believed to have actual effects (effect sparsity)
and either budget or time constraints limit the number of experimental runs.

Since Satterthwaite (1959) first introduced the supersaturated design as a random
balanced design, research has focused primarily on three areas: design construction
methods, development of criteria to assess supersaturated designs, and data analysis
methods used to identify the important effects.

Booth and Cox (1962) provided seven supersaturated designs for two-level factors
created via computer search using the E(s^2) criterion, which measures the average
squared inner product, and hence the degree of correlation, between pairs of design
columns. A general design construction method did not exist until Lin (1993)
developed a method based upon half fractions of Hadamard matrices. Subsequently,
Lin (1995), Nguyen (1996), and Li and Wu (1997) proposed methods based upon the
E(s^2) criterion, while Jones et al. (2008) constructed designs using Bayesian
D-optimality.
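
The E(s^2) criterion can be illustrated with a short calculation. The sketch below assumes the usual definition for two-level designs coded as -1/+1, namely the average of the squared inner products over all pairs of distinct columns. The 6-run, 6-factor array is a hypothetical example constructed here for illustration and is not one of the Booth and Cox (1962) designs.

```python
from itertools import combinations

def e_s2(design):
    """Average squared inner product over all pairs of distinct design columns."""
    n_cols = len(design[0])
    cols = [[run[j] for run in design] for j in range(n_cols)]
    pairs = list(combinations(range(n_cols), 2))
    total = sum(sum(a * b for a, b in zip(cols[i], cols[j])) ** 2 for i, j in pairs)
    return total / len(pairs)

# Full 2^2 factorial: the two columns are orthogonal.
orthogonal = [[-1, -1], [1, -1], [-1, 1], [1, 1]]

# Hypothetical supersaturated array: k = 6 balanced columns in N = 6 runs (k > N - 1).
ssd = [
    [ 1,  1,  1,  1,  1,  1],
    [ 1,  1, -1, -1, -1,  1],
    [ 1, -1,  1, -1,  1, -1],
    [-1,  1,  1,  1, -1, -1],
    [-1, -1, -1,  1,  1,  1],
    [-1, -1, -1, -1, -1, -1],
]

print(e_s2(orthogonal))  # 0.0
print(e_s2(ssd))         # 4.0
```

Every pair of distinct balanced columns with six runs has an inner product of +2 or -2, so this array attains E(s^2) = 4, whereas an orthogonal design attains zero. Smaller values indicate columns closer to orthogonal.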

Yamada and Lin (1999) and Yamada et al. (1999) were the first to discuss and provide
construction methods for three-level supersaturated designs. Fang et al. (2000)
first addressed the construction of multi-level supersaturated designs. More recently,
Yamada et al. (2006) detailed a general construction method for mixed-level
supersaturated designs. Beyond construction methods, little else has been done
with three-level SSDs.

Closely related to the manner in which supersaturated designs are created are the
evaluation criteria used to differentiate among the resulting designs.
E(s^2)-optimality is still the most widely used criterion for selecting supersaturated
designs, but a Bayesian D-optimality criterion has also been used. Beattie et al. (2002)
detailed an alternative two-stage Bayesian model selection strategy that combines a
stochastic search variable selection method with an intrinsic Bayes factor method.

Just as there are several ways to assess design quality, there are many ways of
analyzing the data recorded during the experimental runs to identify the important
effects/factors. Three methods are stepwise selection procedures, the Gauss-Dantzig
selector, and model averaging. Candes and Tao (2007) developed the Gauss-Dantzig
selector, which was further expanded upon by Phoa et al. (2009), who proposed
a graphical procedure and an automatic variable selection method to accompany the
Dantzig selector. Marley and Woods (2010) evaluated the use of E(s^2)-optimal and
Bayesian D-optimal designs and the three analysis strategies through the use of
simulation-based experiments.

Overall, SSDs offer a method of greatly reducing the amount of experimentation
needed to screen important factors. However, they should not be used without a
clear understanding of the risks involved. SSDs are nonregular factorial designs in
which orthogonality is not obtainable. Most existing criteria for SSDs measure the
non-orthogonality combinatorially between two factors; E(s^2), for example, gives an
intuitive measure of non-orthogonality where smaller is better. Care must be exercised
in the selection of a design, as a supersaturated design that departs considerably
from orthogonality could produce misleading results. In particular, stepwise regression
is one method that has been used for identifying the effects that should be estimated,
but stepwise regression can easily fail to make the appropriate determination when
the correlations between the columns in the design are not small.

2.6  Second-Order  Screening  Designs 

In contrast with the traditional sequential design approach of response surface
methodology (RSM), recent literature has proposed employing a single experimental
design capable of performing both factor screening and response surface exploration
when conducting multiple experiments is unrealistic due to time, budget, or other
constraints. For instance, in agricultural settings the time required to complete an
experiment can be exceedingly long. Similarly, within a manufacturing setting,
experimental preparation can be overly time-consuming. Directly applicable to the
DOD, Lawson (2003) points out that fixed deadlines for scale-up and production of
prototype engineering designs may not allow the possibility of follow-up
experimentation.

Two important principles used in developing successful screening designs are sparsity
and heredity. The sparsity principle stems from the Pareto principle, which states
that most of the variability in a system or process output is due to a small number of
inputs. Traditionally, factor sparsity has led to the assumption in screening designs
that only a small number of factors are present among the actual model terms, while
effect sparsity indicates that the number of active effects, rather than the number of
active factors, is relatively small. Therefore, it is possible for the effect sparsity
assumption to hold while factor sparsity does not.

Heredity, either strong or weak, is the second screening principle commonly used
when considering model selection. Strong heredity implies that if a model includes
a two-factor interaction, then both of its constituent main effects are included in the
model. Conversely, weak heredity requires only one of the two constituent main
effects be included in the model.
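
The two heredity conditions can be stated as a simple check on a candidate model. The following is a minimal sketch; the function name `obeys_heredity` and the representation of a model as a set of main-effect labels plus a list of interaction pairs are assumptions made for illustration only.

```python
def obeys_heredity(main_effects, interactions, strong=True):
    """Check whether every two-factor interaction has the required parent main effects.

    main_effects: iterable of factor labels whose main effects are in the model.
    interactions: iterable of (factor, factor) pairs for the two-factor interactions.
    strong=True requires both parents; strong=False (weak) requires at least one.
    """
    active = set(main_effects)
    required = 2 if strong else 1
    for a, b in interactions:
        parents_present = (a in active) + (b in active)
        if parents_present < required:
            return False
    return True

# Model with main effects A and B and interactions AB and AC.
mains = {"A", "B"}
two_factor = [("A", "B"), ("A", "C")]

print(obeys_heredity(mains, two_factor, strong=True))   # False: C is not in the model
print(obeys_heredity(mains, two_factor, strong=False))  # True: A is a parent of both
```

In this example the AC interaction violates strong heredity because the main effect of C is absent, yet the model still satisfies weak heredity.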

Initial attempts to use response surface designs capable of performing both factor
screening and response surface exploration with a single design relied upon the
design’s projection capacity.

Cheng  and  Wu  (2001),  hereafter  referred  to  as  CW,  introduced  a  two-stage  analysis 
method  where  the  first  stage  consisted  of  performing  factor  screening  analysis  to 
identify  important  factors  and  the  second  stage  involved  fitting  a  second-order  model 
by  assuming  both  factor  sparsity  and  strong  effect  heredity  held  and  the  region  chosen 
for  factor  screening  contained  the  optimal  response  surface  area. 

For the first stage, CW recommended a main effect analysis method for simplicity.
The key linkage between stage one and two was the ability to project the initial larger
factor space onto a smaller factor space capable of fitting a second-order model. When
the factor sparsity principle holds, any regular fractional factorial design of resolution
R projects onto any subset of R - 1 factors as a full factorial. For example, a 2^(3-1)
design of resolution III (R = 3) projects into a 2^2 design in every subset of two
factors (Myers and Anderson-Cook, 2009). This projection property was extended to
nonregular designs such as Plackett-Burman designs by Lin and Draper (1992) and
Wang and Wu (1995).
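
The projection property in the example above can be verified directly. The sketch below builds the 2^(3-1) half fraction from the generator C = AB (an assumed but standard choice of generator) and checks each projection; the helper name `is_full_factorial` is hypothetical.

```python
from collections import Counter
from itertools import combinations, product

# 2^(3-1) half fraction from the generator C = AB (defining relation I = ABC).
half_fraction = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

def is_full_factorial(design, cols):
    """True if the runs restricted to `cols` contain every +/-1 combination equally often."""
    counts = Counter(tuple(run[c] for c in cols) for run in design)
    points = list(product((-1, 1), repeat=len(cols)))
    return all(counts[p] > 0 for p in points) and len(set(counts.values())) == 1

# Every projection onto R - 1 = 2 factors is a full 2^2 factorial.
two_factor_projections = {
    cols: is_full_factorial(half_fraction, cols) for cols in combinations(range(3), 2)
}

# The design itself (all three factors) is only half of the 2^3 factorial.
all_three = is_full_factorial(half_fraction, (0, 1, 2))
```

Here `two_factor_projections` is true for each of the three factor pairs, while `all_three` is false because only four of the eight 2^3 combinations are run.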

Because a design can project onto many different combinations of factors, a
projection-efficiency criterion was developed to compare orthogonal designs based
upon (1) the number of eligible projected designs, with lower-dimension projections
being more important than higher-dimension projections, and (2) the estimation
efficiency for eligible projected designs, determined by the ratio of each design’s D-
and G-efficiencies (Cheng and Wu, 2001). Eligible designs are designs which can fit a
second-order model. The D- and G-efficiency criteria, denoted D_eff and G_eff
respectively, compare the performance of a design against a corresponding optimal
design (Myers and Anderson-Cook, 2009).

CW studied three orthogonal array (OA) designs, OA(18, 3^7), OA(27, 3^8), and
OA(36, 3^12), which demonstrated desirable projection properties. The OA(N, 3^k)
notation shows the design’s number of runs N and number of three-level factors k. In
contrast to 3^(n-k) designs, which have defining contrast subgroups to describe the
design structure, the OA(N, 3^k) designs studied by CW required computer search to
classify the possible projected designs. Fortunately, while these designs are more
complex, their overall projection properties are better and they generally require
fewer runs. When compared to CCDs, the OA(N, 3^k) designs studied exhibited good
D-efficiencies but poor G-efficiencies as p, the number of projected factors,
increases. However, this should be expected because as p increases, the size of CCDs
increases while the size of the OA(N, 3^k) designs is fixed for any p.

Improving on the designs of CW, Xu et al. (2004), hereafter referred to as XCW,
proposed a combinatorial method for constructing new and efficient OA designs and a
design selection approach based upon a projection aberration criterion, which combines
the generalized word-length pattern of the generalized minimum aberration criterion
(Xu and Wu, 2001) for factor screening and the projection-efficiency criteria (Cheng
and Wu, 2001) for interaction detection. XCW assessed the projection performance
of three combinatorially non-isomorphic OA(18, 3^7)s and three combinatorially
non-isomorphic OA(27, 3^13)s. Their three-step approach involves: (1) screening out poor
orthogonal arrays for factor screening using the generalized word-length pattern,
(2) applying the projection aberration criterion to select a best design from step 1,
and (3) determining the best level permutations of the design from step 2 to improve
design projection eligibility and estimation efficiency under the second-order model
(Equation 1.2).

Ye et al. (2007), hereafter referred to as YTL, also examined 3-level 18-run and
27-run orthogonal designs. However, in addition to considering the projection properties
of designs, their design choices were based on both model estimation and model
discrimination criteria. The two model estimation criteria employed examine the
proportion of estimable models, Estimation Capacity (EC), and the average D-efficiency
of all models, Information Capacity (IC). Let D denote the design, F the space of
models over D, F' ⊂ F the subset of estimable models over D, and f_i ∈ F'.
Jones et al. (2007) proposed six non-Bayesian criteria for model discrimination, of
which YTL employed the Average Expected Prediction Differences (AEPD)

\[ \mathrm{AEPD} = \binom{d'}{2}^{-1} \sum_{f_i, f_j \in F'(D)} E\left( \|\hat{y}_i - \hat{y}_j\|^2 \;\middle|\; \|y\| = 1 \right) \qquad (2.4) \]

and Minimum Maximum Prediction Difference (MMPD)

\[ \mathrm{MMPD} = \min_{1 \le i < j \le d'} \max_{\|y\| = 1} \|\hat{y}_i - \hat{y}_j\| \qquad (2.5) \]

where d' is the number of estimable models, y is the response vector, and \hat{y}_i is the
fitted value of the ith model (Ye et al., 2007).

While previous work focused primarily on the design’s projection capacity, Edwards
and Truong (2011) applied the Jones and Nachtsheim (2011b) method for
finding efficient designs with minimal aliasing between main effects and two-factor
interactions. Edwards and Truong (2011) constructed 18-, 27-, and 30-run designs,
termed MA designs, for simultaneous screening and response surface optimization for
k = 4 to 7, k = 4 to 13, and k = 6 to 14 factors, respectively, by minimizing the sum
of squares of the elements of the alias matrix, A, subject to a lower bound on the
primary model D-efficiency. The optimization of interest is

\[ \min_{d} \mathrm{Tr}\left[ A(d)' A(d) \right], \quad \text{subject to } D_e(d) \geq l_d, \qquad (2.6) \]

where A(d) is the alias matrix for design d, D_e(d) is the D-efficiency of design d, and
l_d denotes the lower bound for D-efficiency, with 0 < l_d < 1 (Jones and Nachtsheim,
2011b). Edwards and Truong (2011) compared the 27-run orthogonal arrays of XCW
and YTL with MA designs generated with l_d values of 0.8 and 0.9 in terms of
inefficiency of projection and, via a simulation study, the proportion of active factors
declared significant (Power 1) and the proportion of simulations in which only the
true active factors are declared significant (Power 2). Although ranked last in terms
of D-efficiency, the MA designs showed superior performance in their ability to
detect active factors (Edwards and Truong, 2011).
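
The objective in the minimal-aliasing criterion can be computed for small designs by hand. The sketch below assumes the standard alias matrix A = (X1'X1)^(-1) X1'X2, where X1 holds the primary model columns (here the intercept and main effects) and X2 holds the potential terms (here the two-factor interactions). It evaluates only the objective Tr[A'A] and does not perform the constrained search of Jones and Nachtsheim (2011b); all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def cross(x, y):
    """Return x'y for matrices stored as lists of rows."""
    return [[sum(Fraction(rx[i]) * ry[j] for rx, ry in zip(x, y))
             for j in range(len(y[0]))] for i in range(len(x[0]))]

def solve(m, rhs):
    """Solve m Z = rhs by Gauss-Jordan elimination (m must be square and nonsingular)."""
    n = len(m)
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def alias_trace(design):
    """Tr[A'A]: the sum of squared elements of the alias matrix."""
    x1 = [[1] + list(run) for run in design]                      # intercept + main effects
    x2 = [[run[i] * run[j] for i, j in combinations(range(len(run)), 2)]
          for run in design]                                      # two-factor interactions
    a = solve(cross(x1, x1), cross(x1, x2))
    return sum(v * v for row in a for v in row)

full_factorial = list(product((-1, 1), repeat=3))                 # 2^3, 8 runs
half_fraction = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]  # 2^(3-1), C = AB

print(alias_trace(full_factorial))  # 0
print(alias_trace(half_fraction))   # 3
```

The full factorial has no aliasing between main effects and two-factor interactions, so its trace is zero. In the half fraction each main effect is completely aliased with one interaction, giving three unit entries in A and a trace of three.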

A common thread connecting the CW, XCW, YTL, and MA designs is the use of a
linear and quadratic main-effects-only analysis for factor screening. Unfortunately, if
the strong effect heredity principle fails to hold, important interactions can be missed,
leading to a misspecified response surface model. Edwards and Truong (2011)
confirmed this assertion for their designs and the XCW designs through a simulation
model possessing only weak effect heredity. All four research efforts, CW, XCW,
YTL, and MA, acknowledge that the strong effect heredity assumption could
be overly restrictive, but they feel the inclusion of quadratic main effects diminishes
the concern. However, if there is concern that a factor’s significance is only present in
interactions with other factors, the authors proposed the Bayesian approaches
of either Box and Meyer (1993) or Chipman et al. (1997) to account for significant
factors outside of main effects when the strong effect heredity principle fails to hold
(Cheng and Wu, 2001). Unfortunately, these methods are computationally intensive
and are not readily available to practitioners in statistical software packages, thus
likely making their use impractical (Edwards and Truong, 2011). Therefore, potential
research efforts could focus on new or inventive analysis techniques.

Another area of concern for the CW, XCW, YTL, and MA designs is that the projection
onto the factors whose main and/or quadratic effects are deemed significant during the
first-stage analysis does not always yield a second-order design. CW highlighted this
concern using an illustrative example of a 27-run experiment with nine continuous
factors (a 3^(9-6) design). During the main and quadratic effects screening analysis,
CW identified five important factors. Unfortunately, there are no eligible projected
designs of five factors in the 3^(9-6) design. As a result, a subset of the five factors
must be considered when a single experiment is used, and important effects could be
missed.

Edwards and Mee (2011) introduced new spherical Fractional Box-Behnken
designs (FBBD) aimed at overcoming the projection deficiencies and main/quadratic
effect only analysis issues found in the CW/XCW/YTL/MA designs. The FBBD
provide the ability to explore interactions during the screening stage and to fit
second-order models via a backward elimination analysis strategy to each of the
(k - 1)-factor projections. Edwards and Mee (2011) questioned the applicability of the
factor sparsity principle assumed by CW/XCW/YTL/MA designs, preferring instead the
idea of effect sparsity when many factors are under consideration. By effect sparsity,
Edwards and Mee (2011) meant that the number of active effects, rather than factors,
is relatively small. Since it is possible for effect sparsity to hold even when factor
sparsity does not, Edwards and Mee (2011) determined it was necessary to search for
designs having eligible projections onto more factors than the maximum p = 5 factor
projections provided by the CW/XCW/YTL/MA designs.

The FBBDs are developed by taking subsets of the two-level fractional factorial
designs which compose a BBD (Edwards and Mee, 2011). The number of runs
associated with the FBBD varies depending upon the number of factors involved. For
k < 9, the FBBD are saturated/near-saturated response surface designs, but for
k = 10, ..., 13, the FBBD are reduced-run designs. While FBBDs require more runs
than CW/XCW/YTL/MA designs, their ease of construction and aliasing structure
facilitate an analysis strategy which cannot be applied to the CW/XCW/YTL/MA
designs. Additionally, as k increases, the FBBD designs require fewer runs than
CCD/BBD.

Jones  and  Nachtsheim  (2011a)  introduced  a  class  of  three-level  designs  referred  to 
as  “definitive  screening  designs”  where  main  effects  are  not  biased  by  second-order 
effects  and  all  quadratic  effects  are  estimable.  Consisting  of  2k  +  1  runs  for  k  factors, 
these designs were constructed using the same Jones and Nachtsheim (2011b) method
used  by  Edwards  and  Truong  (2011). 
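
The two properties claimed for definitive screening designs can be checked numerically on a small example. The sketch below does not use the numerical search described above; instead it swaps in a fold-over construction from a conference matrix, chosen here only because it is easy to write down. The 6 x 6 matrix `C6` is an assumed example of a conference matrix (zero diagonal, +/-1 elsewhere, mutually orthogonal columns).

```python
from itertools import combinations

# Assumed conference matrix of order 6 (zero diagonal, C'C = 5I).
C6 = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]

def fold_over_design(conf):
    """Stack the matrix, its mirror image, and one center run: 2k + 1 runs for k factors."""
    k = len(conf)
    mirror = [[-v for v in row] for row in conf]
    return [list(row) for row in conf] + mirror + [[0] * k]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

design = fold_over_design(C6)
k = len(C6)
mains = [[run[j] for run in design] for j in range(k)]
quadratics = [[a * a for a in col] for col in mains]
interactions = [[a * b for a, b in zip(mains[i], mains[j])]
                for i, j in combinations(range(k), 2)]

# Main-effect columns are orthogonal to every second-order column,
# so second-order effects cannot bias the main-effect estimates.
unbiased = all(dot(m, s) == 0 for m in mains for s in quadratics + interactions)
print(len(design), unbiased)  # 13 True
```

Because each run is paired with its mirror image, any product of a main-effect column with a quadratic or interaction column cancels within the pair, which is why the check holds for every factor.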

2.7 Chapter Summary


Experimental designs normally recommended for screening and optimization experiments
differ. If an experimenter can afford to run only one experiment, a choice
must  be  made  between  one  objective  or  the  other.  If  the  experimenter  chooses  a 
classical  response  surface  design,  a  subset  of  the  factors  must  be  selected  to  work 
with  and  the  chance  of  missing  other  important  factors  and  improvement  possibilities 
increases.  If  the  experimenter  decides  to  conduct  a  screening  experiment,  important 
interactions  and  quadratic  effects  may  be  missed  that  could  lead  to  further  process 
improvements  and  cost  reductions. 

Examination of classical response surface designs shows the CCD and FCD to be
efficient second-order designs, particularly when compared with the 3^k factorial. They
can accommodate a spherical and a cuboidal region, respectively, through appropriate
design parameter selection while not requiring an unusually large number of design
points. The efficiency of the second-order BBD is comparable to that of the CCD.
However, the BBD only accommodates a spherical design region, ignoring the “extreme”
corner factorial points. This can be beneficial if the operational region does not permit
corner points. Unfortunately, as the number of factors k increases, so does the run
size of these designs. Whereas both the CCD and FCD can reduce their run size
requirements through the use of fractional factorials, at the cost of reducing the
number of estimable effects, the ability to reduce run size requirements for the BBD
has seen little work. One proposal replaces the standard 2^3 designs in a BBD with a
combination of 2^3 and 2^(3-1) resolution III designs. However, the reduction in run
size requirements has a corresponding reduction in parameter estimation efficiency
(Zhang et al., 2011).

When cost constraints restrict the design size to at or near the number of
parameters in a second-order model, hybrid designs for a spherical region and Hoke
designs for a cuboidal region are typically a better option than either Box-Draper
or SCD because they generally provide better efficiency (Myers and Anderson-Cook,
2009). Unfortunately, hybrid and Hoke designs have only been specified for up to
k = 7 and 6 factors, respectively. This could inhibit their usefulness, particularly if
a factor screening design cannot be used to eliminate insignificant factors. Hybrid
designs currently involve the use of a 2^k factorial. A potential expansion of hybrid
designs could include using 2^(k-p) factorial designs for large values of k. Additionally,
since these designs currently use full factorials, the design projectivity
of the reduced-run designs could be addressed. Unfortunately, the nature of hybrid
designs results in odd factor levels which could be difficult to obtain in practice. It
is in this instance that computer-generated designs can prove most useful, even for
saturated designs.

Contrary to classical response surface designs, screening designs
focus primarily on examining main effects, potentially at the expense of important
interactions and quadratic effects. While there is abundant research
dealing with the construction methods, evaluation criteria, and data analysis methods
used to identify the important effects for supersaturated designs, the majority of this
research assumes the underlying response model is linear (first-order) in nature.
Matsuura et al. (2011) constructed a supersaturated design using a Hadamard
matrix and proposed its use in robust parameter design. The design was compared
with a D-optimal design, a Central Composite design, and a Box-Behnken design.
With considerably fewer experimental runs than CCDs and BBDs, the
design demonstrated the capacity to identify main, two-factor interaction, and pure
quadratic effects of active factors under the effect sparsity assumption. The work of
Matsuura et al. (2011) is currently the only published finding which directly addresses
the consideration of quadratic effects.

In the instances where neither a single screening design nor a single response surface
design can fulfill experimental objectives, second-order screening designs have been
proposed which simultaneously screen factors beyond only main effects and provide the
capacity to estimate a second-order response model within a subset of the original
factors. In order to do so, assumptions such as factor or effect sparsity and effect
heredity are made, to varying degrees, to facilitate data analysis. To what degree these
assumptions hold has been debated and therefore could influence the use of these
designs. As a result, a thorough comparison of these designs, as these assumptions
are relaxed, could provide insight into the various designs’ robustness.

III. Effect of Heredity and Sparsity on Second-Order
Screening Design Performance


3.1  Introduction 

Box and Wilson (1951) laid the foundation for response surface methodology
(RSM) by outlining a philosophy of sequential experimentation which included
experiments for screening, region seeking (such as steepest ascent), process/product
characterization, and process/product optimization (Myers et al., 2004). Box and
Liu (1999) illustrated a number of concepts which Box understood as the
embodiment of RSM at the time, including the philosophy of sequential learning.

In contrast with the traditional sequential design approach of RSM, recent
literature has proposed employing a single experimental design capable of performing
both factor screening and response surface exploration when conducting multiple
experiments is unrealistic due to time, budget, or other constraints. For instance, in
agriculture the time required to collect data specified by the design can be
exceedingly long. Within a manufacturing setting, experimental preparation can be overly
time-consuming. Directly applicable to the DOD, Lawson (2003) points out that fixed
deadlines for scale-up and production of prototype engineering designs may not allow
the possibility of follow-up experimentation.

Military  systems,  particularly  aerodynamic  systems,  are  complex.  It  is  not  unusual 
for these systems to exhibit nonlinear behavior. Developmental testing may be tasked
to characterize the nonlinear behavior of such systems while being restricted in how
much testing can be accomplished. In these instances, the single experimental design
may be the preferred approach.

Second-order screening design methodology, sometimes referred to as One-Step
RSM or Definitive Screening, is a relatively new focus in statistical research and is
effectively unknown to the defense test community. Important questions as to the
methods’ usefulness and applicability to defense testing remain unaddressed. Nonetheless,
second-order screening designs for nonlinear system responses provide a means
to effectively focus test resources onto those factors driving system performance.

Two important principles used in developing successful screening designs are sparsity
and heredity. The sparsity principle stems from the Pareto principle, which states
that most of the variability in a system or process output is due to a small number
of inputs. Traditionally, factor sparsity has led to the assumption in screening designs
that only a small number of factors are present among the actual model terms.
However, uncertainty over the degree to which factor sparsity holds as the number of
factors grows has resulted in a debate between effect sparsity and factor sparsity.
Effect sparsity indicates that the number of active effects, rather than the number of
active factors, is relatively small. Therefore, it is possible for the effect sparsity
assumption to hold while factor sparsity does not.

Heredity,  either  strong  or  weak,  is  another  screening  principle  commonly  used 
when  considering  model  selection.  Strong  heredity  implies  that  if  a  model  includes 
a  two-factor  interaction,  then  both  its  constituent  main  effects  are  also  included  in 
the  model.  Conversely,  weak  heredity  requires  only  one  of  the  two  constituent  main 
effects  be  included  in  the  model. 

Edwards and Truong (2011) performed a simulation study examining several
second-order screening designs, focusing on each design’s ability to correctly
identify active factors under a variety of conditions. While 5000 response vectors were
simulated for several combinations of coefficient magnitudes, the truth models used
assumed both factor sparsity and strong effect heredity. This article formally
examines the robustness of the two arguably best second-order screening designs with
respect to the assumptions of both sparsity (factor or effect) and heredity (strong or
weak).

The remainder of this paper is organized as follows. Section 3.2 discusses the
literature relevant to second-order screening designs. In Section 3.3, we present an
empirical study that quantifies the robustness of the two second-order screening
designs to assumptions of heredity and sparsity. Section 3.4 provides a discussion on
the tradeoffs in selecting between the two designs and Section 3.5 presents a summary
of the conclusions.

3.2 Second-Order Screening Designs

Initial attempts to use response surface designs capable of performing both factor
screening and response surface exploration with a single design relied upon the
design’s projection capacity.

Cheng  and  Wu  (2001),  hereafter  referred  to  as  CW,  introduced  a  two-stage  analysis 
method  where  the  first  stage  consisted  of  performing  factor  screening  analysis  to 
identify  important  factors  and  the  second  stage  involved  fitting  a  second-order  model 
by  assuming  both  factor  sparsity  and  strong  effect  heredity  held  and  that  the  region 
chosen  for  factor  screening  contained  the  optimal  response  surface  area. 

For the first stage, CW recommended a main effects analysis method for simplicity.
The key linkage between stage one and two was the ability to project the
initial larger factor space onto a smaller factor space capable of fitting a second-order
model. When the factor sparsity principle holds, any regular fractional factorial
design of resolution R projects onto any subset of R - 1 factors as a full factorial. For
example, a 2^(3-1) design of resolution III (R = 3) can project into a 2^2 design
(Myers and Anderson-Cook, 2009). This projection property extends to nonregular
designs like Plackett-Burman designs, as discussed by both Lin and Draper (1992) and
Wang and Wu (1995).

Since a larger design can project onto many different combinations of factors,
a projection-efficiency criterion was developed to compare orthogonal designs based
upon (1) the number of eligible projected designs, with lower-dimension projections
being more important than higher-dimension projections, and (2) the estimation
efficiency for eligible projected designs, determined by the ratio of each design’s D-
and G-efficiencies (Cheng and Wu, 2001). Eligible designs are designs which can fit
a second-order model. Their D- and G-efficiencies, denoted D_eff and G_eff,
respectively, compare the performance of a design against a corresponding
optimal design (Myers and Anderson-Cook, 2009).

CW studied three orthogonal array (OA) designs, OA(18, 3^7), OA(27, 3^8), and
OA(36, 3^12), each of which demonstrated desirable projection properties. The
OA(N, 3^k) notation shows the design’s number of runs N and number of three-level
factors k. In contrast to 3^(n-k) designs, which have defining contrast subgroups to
describe the design structure, the OA(N, 3^k) designs studied by CW required computer
search to classify the possible projected designs. Fortunately, while design generation
is more complex, the overall projection properties are better and the designs generally
require fewer experimental runs. When compared to Central Composite Designs (CCD),
the OA(N, 3^k) designs studied exhibited good D-efficiencies but poor G-efficiencies
as p, the number of projected factors, increases. However, this is to be expected
because as p increases, the size of CCDs increases while the size of the OA(N, 3^k)
designs is fixed for any p.

Improving on the designs of CW, Xu et al. (2004), hereafter referred to as XCW,
proposed a combinatorial method for constructing new and efficient OA designs and a
design selection approach based upon a projection aberration criterion which combines
the generalized word-length pattern of the generalized minimum aberration criterion
(Xu and Wu, 2001) for factor screening and the projection-efficiency criteria (Cheng
and Wu, 2001) for interaction detection. XCW assessed the projection performance
of three combinatorially non-isomorphic OA(18, $3^7$)s and three combinatorially
non-isomorphic OA(27, $3^{13}$)s. Their three-step approach involves: (1) screening out poor
orthogonal arrays for factor screening using the generalized word-length pattern, (2)
applying the projection aberration criterion to select a best design from step (1), and
(3) determining the best level permutations of the design from step (2) to improve
design projection eligibility and estimation efficiency under the second-order model:

$$\eta = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} x_i x_j \qquad (3.1)$$

Ye et al. (2007), hereafter referred to as YTL, also examined 3-level 18-run and
27-run orthogonal designs. However, in addition to considering the projection properties
of designs, their design choices were based on both model estimation and model
discrimination criteria. The two model estimation criteria employed are Estimation
Capacity (EC), the proportion of estimable models, and Information Capacity (IC), the
average D-efficiency of all models. Let D denote the design, F the space of models
over D, and $F' \subset F$ the subset of models estimable over D, with $f \in F'$. For model
discrimination, Jones et al. (2007) proposed six non-Bayesian criteria, of which YTL
employed the Average Expected Prediction Difference (AEPD)

$$\text{AEPD} = \binom{d'}{2}^{-1} \sum_{f_i, f_j \in F'(D)} E\left( \|\hat{y}_i - \hat{y}_j\|^2 \,\middle|\, \|y\| = 1 \right) \qquad (3.2)$$

and Minimum Maximum Prediction Difference (MMPD)

$$\text{MMPD} = \min_{1 \le i < j \le n} \; \max_{\|y\| = 1} \|\hat{y}_i - \hat{y}_j\| \qquad (3.3)$$

where $d'$ is the number of candidate models, $y$ is the response vector, and $\hat{y}_i$ is the
fitted value of the ith model (Ye et al., 2007).

While previous work focused primarily on the design's projection capacity, Edwards
and Truong (2011) applied the Jones and Nachtsheim (2011b) method for
finding efficient designs with minimal aliasing between main effects and two-factor
interactions. Edwards and Truong (2011) constructed 18-, 27-, and 30-run designs,
deemed MA designs, for simultaneous screening and response surface optimization for
k = 4 to 7, k = 4 to 13, and k = 6 to 14 factors, respectively, by minimizing the sum
of squares of the elements of the alias matrix, A, subject to a lower bound on the
primary model D-efficiency. Their optimization of interest is

$$\min_{d} \; \mathrm{Tr}\left[ A(d)' A(d) \right], \quad \text{subject to } D_e(d) > l_D, \qquad (3.4)$$

where $A(d)$ is the alias matrix for design d, $D_e(d)$ is the D-efficiency of design d, and
$l_D$ denotes the lower bound for D-efficiency with $0 < l_D < 1$ (Jones and Nachtsheim,
2011b). Edwards and Truong (2011) compared the 27-run orthogonal arrays of XCW and
YTL with MA designs generated with $l_D$ values of 0.8 and 0.9. The comparison used the
projection D-efficiency and, via a simulation study, the proportion of active factors
declared significant (Power 1) and the proportion of simulations in which only the true
active factors are declared significant (Power 2). Although ranked last in terms of
D-efficiency, the MA designs showed superior performance in their ability to detect
active factors (Edwards and Truong, 2011).
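To make the alias-matrix objective in (3.4) concrete, the following minimal sketch evaluates it for a tiny design. The design is an illustrative $2^{3-1}$ half fraction with generator C = AB, not one of the MA designs; the function names are hypothetical, and the D-efficiency formula $|X_1'X_1|^{1/p}/n$ is one common convention assumed here rather than taken from the source.

```python
import numpy as np

# Illustrative 2^(3-1) half fraction with generator C = AB (a stand-in, not an MA design).
A = np.array([-1.0, 1.0, -1.0, 1.0])
B = np.array([-1.0, -1.0, 1.0, 1.0])
C = A * B

X1 = np.column_stack([np.ones(4), A, B, C])  # primary model: intercept + main effects
X2 = np.column_stack([A * B, A * C, B * C])  # potential terms: two-factor interactions

def alias_matrix(X1, X2):
    """Alias matrix (X1'X1)^{-1} X1'X2: bias of primary estimates from omitted terms."""
    return np.linalg.solve(X1.T @ X1, X1.T @ X2)

def d_efficiency(X1):
    """Assumed convention: |X1'X1|^(1/p) / n for factors coded to [-1, 1]."""
    n, p = X1.shape
    return np.linalg.det(X1.T @ X1) ** (1.0 / p) / n

alias = alias_matrix(X1, X2)
objective = np.trace(alias.T @ alias)  # sum of squared alias-matrix elements
print(np.round(alias, 3))
print(objective, d_efficiency(X1))
```

In this fraction every main effect is fully aliased with one interaction, so the objective is large even though the main-effects D-efficiency is perfect; the optimization in (3.4) trades these two quantities against each other.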

A common thread connecting all CW, XCW, YTL, and MA designs is the use of a
linear and quadratic main-effects only analysis for factor screening. Unfortunately, if
the strong effect heredity principle fails to hold, important interactions can be missed,
leading to a misspecified response surface model. Edwards and Truong (2011)
confirmed this assertion for their designs and the XCW designs using a simulation model
possessing only weak effect heredity. All four research efforts, CW, XCW, YTL, and
MA, acknowledge that while the strong effect heredity assumption could be overly
restrictive, they feel the inclusion of quadratic main effects diminishes the concern.
However, Edwards and Truong (2011) proposed either the Bayesian approaches of
Box and Meyer (1993) or Chipman et al. (1997) to account for significant factors
outside of main effects when the strong effect heredity principle fails to hold (Cheng
and Wu, 2001). Unfortunately, these methods are not readily available in statistical
software packages and are computationally intensive procedures, thus likely limiting
widespread use (Edwards and Truong, 2011). Therefore, potential research efforts
could focus on new analysis techniques.

Another area of concern for the CW, XCW, YTL, and MA designs is that the
projection onto the factors whose main and/or quadratic effects are deemed significant
during the first stage analysis does not always yield a second-order design. CW
highlighted this concern using an illustrative example of a 27-run experiment with nine
factors ($3^{9-6}$). During the main and quadratic effects screening analysis, CW identified
five important factors; unfortunately, there are no eligible projected designs of five
factors in the $3^{9-6}$ design. As a result, a subset of the five important factors must be
considered in order to have enough degrees of freedom to fit a full second-order model.
Since not all the important factors can be used, important effects could be missed.

Edwards and Mee (2011) introduced new spherical Fractional Box-Behnken
designs (FBBD) aimed at overcoming the projection deficiencies and main/quadratic
effect only analysis issues found in the CW/XCW/YTL/MA designs. The FBBD
provides the ability to explore interactions during the screening stage and to fit
second-order models via a backward elimination analysis strategy to each of the (k - 1)-factor
projections. Edwards and Mee (2011) questioned the applicability of the factor
sparsity principle assumed by the CW/XCW/YTL/MA designs, preferring instead the idea
of effect sparsity when many factors are under consideration. Since it is possible for
effect sparsity to hold even when factor sparsity does not, Edwards and Mee (2011)
determined it was necessary to search for designs having larger eligible factor
projections than the maximum p = 5 factor projections provided with the
CW/XCW/YTL/MA designs.
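A simple run-count check illustrates why projection size is capped: a full second-order model in p factors has 1 + 2p + p(p - 1)/2 parameters, so a design with N runs can support at most the largest p for which that count does not exceed N. The sketch below (function names are hypothetical) applies this necessary, but not sufficient, condition to the 18-, 21-, 27-, and 49-run sizes mentioned in this chapter; as the CW example shows, a projection that passes the count can still fail to be eligible.

```python
def second_order_terms(p):
    """Intercept + p linear + p pure quadratic + p(p-1)/2 two-factor interaction terms."""
    return 1 + 2 * p + p * (p - 1) // 2

def max_factors_by_run_count(n_runs):
    """Largest p whose full second-order model has no more parameters than runs."""
    p = 0
    while second_order_terms(p + 1) <= n_runs:
        p += 1
    return p

for n in (18, 21, 27, 49):
    p = max_factors_by_run_count(n)
    print(f"{n} runs: at most p = {p} factors ({second_order_terms(p)} parameters)")
```

By this count a 27-run design cannot support more than five factors, and a 49-run nine-factor design cannot fit all nine factors at once (55 parameters), which is consistent with starting the analysis from eight-factor models.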

The FBBDs are developed by taking subsets of the two-level fractional factorial
designs which compose a Box-Behnken Design (BBD) (Edwards and Mee, 2011).
The number of runs associated with the FBBD varies depending upon the number
of factors involved. For k < 9, the FBBDs are saturated or near-saturated response
surface designs, but for k = 10, ..., 13, the FBBDs are reduced run designs. While
FBBDs require more runs than CW/XCW/YTL/MA designs, their ease of construction
and aliasing structure facilitate an analysis strategy which cannot be applied to
the CW/XCW/YTL/MA designs. Additionally, as k increases, the FBBDs
require fewer runs than CCD/BBD designs.

Jones  and  Nachtsheim  (2011a)  introduced  a  class  of  three-level  designs  referred  to 
as  “definitive  screening  designs”  where  main  effects  are  not  biased  by  second-order 
effects  and  all  quadratic  effects  are  estimable.  Consisting  of  2k  +  1  runs  for  k  factors, 
these  designs  were  constructed  using  the  same  Jones  and  Nachtsheim  (2011b)  method 
used  by  Edwards  and  Truong  (2011). 
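The stated properties of a definitive screening design can be checked numerically. The sketch below builds a small DSD by stacking a conference matrix C, its foldover -C, and a center run, the conference-matrix route attributed to Xiao et al. (2012) later in this chapter. It uses a Paley-type conference matrix of order 6, so it shows a six-factor, 13-run design rather than the nine-factor design studied here; all function names are hypothetical. (The 21 runs reported for the nine-factor design would be consistent with starting from an order-10 conference matrix and dropping a column, but that is an inference, not something stated in the text.)

```python
import numpy as np

def paley_conference_matrix(q):
    """Symmetric conference matrix of order q + 1 for a prime q with q % 4 == 1."""
    residues = {(x * x) % q for x in range(1, q)}

    def chi(a):
        # Quadratic character mod q: 0 for 0, +1 for residues, -1 otherwise.
        a %= q
        if a == 0:
            return 0
        return 1 if a in residues else -1

    C = np.zeros((q + 1, q + 1), dtype=int)
    C[0, 1:] = 1
    C[1:, 0] = 1
    C[1:, 1:] = [[chi(i - j) for j in range(q)] for i in range(q)]
    return C

def definitive_screening_design(C):
    """Stack C, its foldover -C, and one center run: 2k + 1 runs for k factors."""
    k = C.shape[0]
    return np.vstack([C, -C, np.zeros((1, k), dtype=int)])

C = paley_conference_matrix(5)
D = definitive_screening_design(C)
k = D.shape[1]

quadratics = D ** 2
interactions = np.column_stack(
    [D[:, i] * D[:, j] for i in range(k) for j in range(i + 1, k)]
)

print(D.shape)                            # runs x factors
print(np.abs(D.T @ quadratics).max())     # main effects vs pure quadratic effects
print(np.abs(D.T @ interactions).max())   # main effects vs two-factor interactions
```

Both cross-products are zero because every run is paired with its mirror image, which is why main-effect estimates are not biased by any second-order effect.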

3.3  Empirical  Study 

Our empirical study examines two nine-factor designs: the FBBD identified as
(1/2)559.1 in Table 4 of Edwards and Mee (2011) and the definitive screening design
generated using conference matrices based on Xiao et al. (2012). The study focuses on
each design's robustness in detecting important effects in models exhibiting different
combinations of heredity and sparsity. A single replication is investigated in depth for
each scenario, where the truth model terms and coefficients were chosen to be a
variation on the original nine-factor model considered by Edwards and Mee (2011).
Additionally, each design is analyzed using its authors' recommended analysis
methodology. Summary statistics involving replications for each model are then
provided and discussed.

The Edwards and Mee (2011) analysis methodology involves performing a
factor-based backward elimination to identify a possible second-order model. Since there
are not enough degrees of freedom to fit a full second-order model for all the factors
under consideration, the first step assumes at least one factor can be omitted from
the second-order model containing all k = 9 factors. Therefore, the root mean-square
error (RMSE) of each of the nine (k - 1 = 8)-factor second-order models is compared,
and the model yielding the smallest RMSE is selected, thus identifying which
factor is omitted. Subsequent steps involve determining whether any additional factors
may be removed, based upon whether all effects involving the factor (main, quadratic,
and two-way interactions) in the second-order model are negligible.
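The first step of the factor-based backward elimination described above can be sketched as follows. This is only an illustration under stated assumptions: the design is a random 80-run three-level stand-in (the actual 49-run FBBD is not reproduced here), the response follows the Case 1 truth model with unit-standard-deviation noise, and the helper names are hypothetical.

```python
import itertools
import numpy as np

FACTORS = list("ABCDEFGHJ")

def second_order_matrix(X):
    """Model matrix with intercept, linear, pure quadratic, and two-factor interaction columns."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] ** 2 for i in range(k)]
    cols += [X[:, i] * X[:, j] for i, j in itertools.combinations(range(k), 2)]
    return np.column_stack(cols)

def rmse(X, y):
    """Root mean-square error of the full second-order least-squares fit in the columns of X."""
    M = second_order_matrix(X)
    beta, *_ = np.linalg.lstsq(M, y, rcond=None)
    resid = y - M @ beta
    dof = M.shape[0] - np.linalg.matrix_rank(M)
    return np.sqrt(resid @ resid / dof)

# Stand-in design and simulated response (assumptions, not the thesis data).
rng = np.random.default_rng(0)
X = rng.integers(-1, 2, size=(80, 9)).astype(float)
A, E, G = X[:, 0], X[:, 4], X[:, 6]
y = (2 * A - 1.5 * E + 2 * G - 3 * A**2 + 2.5 * E**2 - 4 * G**2
     + 4 * A * E + 3.5 * A * G - 5 * E * G + rng.normal(0.0, 1.0, 80))

# Step 1: fit each of the nine eight-factor models and drop the factor
# whose omission gives the smallest RMSE.
scores = {}
for drop in range(9):
    keep = [i for i in range(9) if i != drop]
    scores[FACTORS[drop]] = rmse(X[:, keep], y)

dropped = min(scores, key=scores.get)
print({f: round(s, 2) for f, s in scores.items()})
print("factor omitted at step 1:", dropped)
```

Omitting an active factor leaves its main, quadratic, and interaction terms in the residual, so those eight-factor fits show a visibly larger RMSE than fits that omit an inactive factor.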

Jones and Nachtsheim (2011a) suggest using a forward stepwise regression, which
considers all terms in a second-order model of k = 9 factors. With a p-value of 0.1 to
enter, effects are added into the second-order model while forcing a strong heredity
model. As such, when either two-factor interactions or pure-quadratic effects are
included in the model, the lower order terms must also be included.

Four cases are considered to represent different combinations of model heredity
(strong or weak) and sparsity (factor or effect). In addition, each model is examined
with four different noise level scenarios. The noise level vector used in each scenario
is identical across each model for each design. The 49 treatment combinations for
the Edwards and Mee (2011) design and the 21 treatment combinations for the Jones
and Nachtsheim (2011a) design are given in Tables 11 and 12, respectively. Tables
13 and 14 show the simulated response values for the 16 combinations of case and
noise level scenario for Edwards and Mee (2011) and Jones and Nachtsheim (2011a),
respectively.

Case  1  data  was  simulated  based  on  the  model 


$$y_i = 2A_i - 1.5E_i + 2G_i - 3A_i^2 + 2.5E_i^2 - 4G_i^2 + 4A_iE_i + 3.5A_iG_i - 5E_iG_i + \varepsilon_i, \qquad (3.5)$$

thereby  representing  a  response  which  exhibits  factor  sparsity  and  strong  heredity 
between  active  two-factor  interactions  or  pure  quadratic  effects  and  main  effects. 
The  model  exhibits  factor  sparsity  as  only  3  of  the  9  factors  are  active  within  the  9 
effects  contained  in  the  model. 

The Edwards and Mee (2011) analysis method first performs a factor-based
backward elimination to identify a possible second-order model. Table 15 shows the
backward elimination steps for the Table 13, Case 1 data for all four noise level scenarios
using $\alpha = 0.05$.

For example, when considering Case 1, Scenario 3, where $\varepsilon_i \sim N(0, 3)$, the
eight-factor second-order model that omits F has the smallest RMSE among all the
eight-factor models, and the contributions of factor J as a main effect, a quadratic
effect, or part of a two-way interaction are negligible.

After identifying which factors can be removed, Edwards and Mee (2011) fit a full
second-order model in the remaining factors. Again considering Scenario 3,
a full second-order model is fit using the remaining factors: A, B, C, D, E, G, and H.

In contrast to the Edwards and Mee (2011) factor-based backward elimination analysis
method, Jones and Nachtsheim (2011a) perform forward stepwise regression with a
p-value of 0.1 to enter while forcing a strong heredity model. Table 16 shows the
forward stepwise regression steps for the Case 1 data for all four noise level scenarios.

Table 11. Nine-Factor Fractional Box-Behnken Design (FBBD)

Table 12. Nine-Factor Definitive Screening Design (DSD)

Table 13. Nine-Factor FBBD Simulated Response

Table 15. FBBD Stepwise Backward Elimination Results: Case 1
(entries are the factors removed at each step)

| Step | $\varepsilon \sim N(0,1)$ | $\varepsilon \sim N(0,2)$ | $\varepsilon \sim N(0,3)$ | $\varepsilon \sim N(0,5)$ |
|---|---|---|---|---|
| 1 | H | J | F | E |
| 2 | - | F | J | F |
| 3 | - | - | - | H |
| 4 | - | - | - | D |
| 5 | - | - | - | C |
| 6 | - | - | - | B |

Table 16. DSD Forward Stepwise Results: Case 1
(entries are the effects added at each step)

| Step | $\varepsilon \sim N(0,1)$ | $\varepsilon \sim N(0,2)$ | $\varepsilon \sim N(0,3)$ | $\varepsilon \sim N(0,5)$ |
|---|---|---|---|---|
| 1 | EG | AG | EG | CH |
| 2 | AG | EG | AG | EJ |
| 3 | AE | EJ | AE | $F^2$ |
| 4 | $G^2$ | AE | DF | AG |
| 5 | AJ | - | $H^2$ | D |
| 6 | DH | - | - | - |
| 7 | AD | - | - | - |
| 8 | F | - | - | - |
| 9 | C | - | - | - |

Since the “combined” option rule is used for the forward stepwise regression, the
inclusion of a two-way interaction or pure quadratic effect results in the inclusion of
all the factors which comprise that effect. For example, when considering Scenario 3,
where $\varepsilon_i \sim N(0, 3)$, the EG and $H^2$ effects, which entered the regression
model in steps 1 and 5, respectively, would require that the E, G, and H factors also
be in the model.
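The strong-heredity rule applied during the forward stepwise analysis can be expressed as a small closure operation: whenever an interaction or quadratic term enters, its parent main effects are added if they are not already present. The sketch below is a hypothetical helper, not the software option itself, and uses the Case 1, Scenario 3 entry order reported in Table 16, with labels such as "H2" standing for $H^2$.

```python
import re

def parents(term):
    """Parent factor letters of an effect label such as 'EG', 'H2', or 'F'."""
    return sorted(set(re.findall(r"[A-Z]", term)))

def with_strong_heredity(entered):
    """Return the model terms implied by the entered effects under strong heredity."""
    model = []
    for term in entered:
        # Add any missing parent main effects first, then the term itself.
        for t in parents(term) + [term]:
            if t not in model:
                model.append(t)
    return model

entered = ["EG", "AG", "AE", "DF", "H2"]  # Table 16, Case 1, Scenario 3
model = with_strong_heredity(entered)
print(model)
```

The resulting model contains the main effects A, D, E, F, G, and H in addition to the five entered effects, which is why inactive main effects such as D, F, and H can appear as Type I errors even though only second-order terms were selected.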

Case  2  data  was  simulated  using  the  model 


$$y_i = 2A_i - 1.5E_i + 2G_i + 4C_i - 3H_i + 2.5E_i^2 - 4G_iH_i + 3.5E_iH_i - 5C_iG_i + \varepsilon_i, \qquad (3.6)$$

to represent a response exhibiting effect sparsity and strong heredity between active
two-factor interactions or pure quadratic effects and their associated main effects.
The model is considered to exhibit effect sparsity because, although over 50% of the
factors (5 of 9) are active, only 9 of 54 total effects are active. Cases 1 and 2 both have
the same number of active effects but differ in the number of active factors contained
within the second-order portion of the model.
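The total of 54 effects quoted above follows from counting the terms of a second-order model in k = 9 factors, excluding the intercept:

```latex
\[
\underbrace{k}_{\text{main effects}}
+ \underbrace{k}_{\text{pure quadratic effects}}
+ \underbrace{\binom{k}{2}}_{\text{two-factor interactions}}
= 9 + 9 + 36 = 54,
\qquad
\frac{9}{54} \approx 17\% \text{ of effects active.}
\]
```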

Table 17 provides the Edwards and Mee (2011) factor-based backward elimination
results and Table 18 the Jones and Nachtsheim (2011a) forward stepwise regression
results, using the Case 2 response data associated with each design for all four
noise level scenarios in Tables 13 and 14.


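Table 17. FBBD Stepwise Backward Elimination Results: Case 2
(entries are the factors removed at each step)

| Step | $\varepsilon \sim N(0,1)$ | $\varepsilon \sim N(0,2)$ | $\varepsilon \sim N(0,3)$ | $\varepsilon \sim N(0,5)$ |
|---|---|---|---|---|
| 1 | H | J | F | E |
| 2 | - | F | J | B |
| 3 | - | B | - | D |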

Case  3  data  was  simulated  using  the  model 


$$y_i = 2A_i + 2E_i - 1.5A_i^2 + 2.5E_i^2 - 3.5A_iE_i + 4A_iG_i - 5E_iG_i + \varepsilon_i, \qquad (3.7)$$
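Table 18. DSD Forward Stepwise Results: Case 2
(entries are the effects added at each step)

| Step | $\varepsilon \sim N(0,1)$ | $\varepsilon \sim N(0,2)$ | $\varepsilon \sim N(0,3)$ | $\varepsilon \sim N(0,5)$ |
|---|---|---|---|---|
| 1 | CG | $C^2$ | $C^2$ | EH |
| 2 | GH | GH | $E^2$ | CE |
| 3 | EH | CG | DH | AE |
| 4 | A | CJ | $A^2$ | CF |
| 5 | $E^2$ | A | DG | AG |
| 6 | $J^2$ | GJ | AF | D |
| 7 | DH | - | - | - |
| 8 | CF | - | - | - |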

thereby representing a response exhibiting factor sparsity and weak heredity between
active two-factor interactions or pure quadratic effects and main effects. The model
exhibits factor sparsity because only 3 of the 9 factors are active within the 7 effects
contained in the model. Since not all of the factors that comprise the two-factor
interactions are present as main effects, the model exhibits weak heredity. For instance,
although factor G is significant within two two-factor interactions, factor G by itself
is not significant.

Table 19 provides the Edwards and Mee (2011) factor-based backward elimination
results and Table 20 the Jones and Nachtsheim (2011a) forward stepwise regression
results, using the Case 3 response data associated with each design for all four
noise level scenarios in Tables 13 and 14.

Case  4  data  was  simulated  using  the  model 


$$y_i = 2A_i - 1.5E_i + 2G_i - 3H_i^2 + 2.5E_i^2 + 4A_iC_i + 3.5E_iH_i - 5C_iG_i - 4G_iH_i + \varepsilon_i, \qquad (3.8)$$

to represent a response exhibiting effect sparsity and weak heredity between active
two-factor interactions or pure quadratic effects and main effects. The Case 4 model is
identical to the model used by Edwards and Mee (2011). However, the response data
differ even for the $\varepsilon_i \sim N(0, 1)$ scenario.
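The heredity and sparsity labels attached to the four truth models can be checked mechanically from their term lists. The sketch below encodes the terms of models (3.5) through (3.8), with labels such as "A2" standing for $A^2$, and tests only the strong-heredity condition (every second-order term has all of its parent main effects in the model); the function names are hypothetical.

```python
import re

CASES = {
    "Case 1": ["A", "E", "G", "A2", "E2", "G2", "AE", "AG", "EG"],
    "Case 2": ["A", "E", "G", "C", "H", "E2", "CG", "EH", "GH"],
    "Case 3": ["A", "E", "A2", "E2", "AE", "AG", "EG"],
    "Case 4": ["A", "E", "G", "E2", "H2", "AC", "CG", "EH", "GH"],
}

def parents(term):
    """Factor letters involved in an effect label."""
    return set(re.findall(r"[A-Z]", term))

def active_factors(terms):
    """All factors that appear in at least one active effect."""
    return set().union(*(parents(t) for t in terms))

def has_strong_heredity(terms):
    """True if every second-order term has all parent main effects in the model."""
    mains = {t for t in terms if len(t) == 1}
    return all(parents(t) <= mains for t in terms if len(t) > 1)

for name, terms in CASES.items():
    factors = sorted(active_factors(terms))
    print(name, "| strong heredity:", has_strong_heredity(terms),
          "| active factors:", len(factors), factors,
          "| active effects:", len(terms))
```

Cases 1 and 2 satisfy strong heredity while Cases 3 and 4 do not, and the active-factor counts (3, 5, 3, 5 of 9) separate the factor-sparsity cases from the effect-sparsity cases.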

Table 21 provides the Edwards and Mee (2011) factor-based backward elimination
results and Table 22 the Jones and Nachtsheim (2011a) forward stepwise regression
results, using the Case 4 response data associated with each design for all four
noise level scenarios in Tables 13 and 14.

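Table 19. FBBD Stepwise Backward Elimination Results: Case 3
(entries are the factors removed at each step)

| Step | $\varepsilon \sim N(0,1)$ | $\varepsilon \sim N(0,2)$ | $\varepsilon \sim N(0,3)$ | $\varepsilon \sim N(0,5)$ |
|---|---|---|---|---|
| 1 | H | J | A | E |
| 2 | - | F | H | C |
| 3 | - | B | F | H |
| 4 | - | - | J | B |
| 5 | - | - | C | J |
| 6 | - | - | B | - |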

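Table 20. DSD Forward Stepwise Results: Case 3
(entries are the effects added at each step)

| Step | $\varepsilon \sim N(0,1)$ | $\varepsilon \sim N(0,2)$ | $\varepsilon \sim N(0,3)$ | $\varepsilon \sim N(0,5)$ |
|---|---|---|---|---|
| 1 | AE | AE | EG | $F^2$ |
| 2 | BF | CH | AG | $E^2$ |
| 3 | $J^2$ | EJ | AE | FH |
| 4 | $A^2$ | $J^2$ | DF | AG |
| 5 | FH | $E^2$ | H | DE |
| 6 | DJ | - | AF | - |
| 7 | $E^2$ | - | - | - |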

3.4  Case  Comparison 

Tables 23, 24, 25, and 26 show, for Cases 1 through 4 and each of the four noise
level scenarios, which effects were properly identified, incorrectly identified (Type I
error), and not identified (Type II error) by both nine-factor designs.
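The three outcome categories used in Tables 23 through 26 are set operations on the true and selected effects. The sketch below (hypothetical function names) reproduces the DSD entries for Case 1 under $\varepsilon \sim N(0, 2)$, where the selected model consists of the effects entered in Table 16 plus the main effects forced in by strong heredity. It also computes, for that single replication, the three percentages of the kind summarized in Table 27.

```python
def classify(truth, selected):
    """Split effects into correctly identified, Type I errors, and Type II errors."""
    truth, selected = set(truth), set(selected)
    return {
        "identified": truth & selected,
        "type_I": selected - truth,    # declared active but not in the truth model
        "type_II": truth - selected,   # truly active but missed
    }

def percent_identified(truth, selected):
    """Percent of all, second-order, and pure quadratic active effects identified."""
    hits = set(truth) & set(selected)
    groups = {
        "all": lambda t: True,
        "second_order": lambda t: len(t) == 2,
        "quadratic": lambda t: len(t) == 2 and t.endswith("2"),
    }
    out = {}
    for name, keep in groups.items():
        total = [t for t in truth if keep(t)]
        found = [t for t in total if t in hits]
        out[name] = round(100 * len(found) / len(total)) if total else None
    return out

truth_case1 = ["A", "E", "G", "A2", "E2", "G2", "AE", "AG", "EG"]
dsd_selected = ["A", "E", "G", "J", "AG", "EG", "EJ", "AE"]  # Case 1, N(0, 2)

result = classify(truth_case1, dsd_selected)
for key, effects in result.items():
    print(key, sorted(effects))
print(percent_identified(truth_case1, dsd_selected))
```

The percentages here describe one replication only, so they need not match the five-replication averages reported in Table 27.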
Table 21. FBBD Stepwise Backward Elimination Results: Case 4
(entries are the factors removed at each step)

| Step | $\varepsilon \sim N(0,1)$ | $\varepsilon \sim N(0,2)$ | $\varepsilon \sim N(0,3)$ | $\varepsilon \sim N(0,5)$ |
|---|---|---|---|---|
| 1 | F | J | F | A |
| 2 | - | F | J | G |
| 3 | - | B | - | C |

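Table 22. DSD Forward Stepwise Results: Case 4
(entries are the effects added at each step)

| Step | $\varepsilon \sim N(0,1)$ | $\varepsilon \sim N(0,2)$ | $\varepsilon \sim N(0,3)$ | $\varepsilon \sim N(0,5)$ |
|---|---|---|---|---|
| 1 | GH | GH | GH | $F^2$ |
| 2 | AH | AE | AH | AC |
| 3 | AF | EG | DE | BG |
| 4 | EF | HJ | AD | CF |
| 5 | $G^2$ | $E^2$ | DG | BF |
| 6 | AC | $J^2$ | FH | E |
| 7 | DF | - | $A^2$ | D |
| 8 | J | - | - | - |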

Table 23. Second Order Screening Design Results: Case 1

Strong Heredity, Factor Sparsity Model:
$2A - 1.5E + 2G - 3A^2 + 2.5E^2 - 4G^2 + 4AE + 3.5AG - 5EG + \varepsilon$

| Scenario | Result | DSD | FBBD |
|---|---|---|---|
| $\varepsilon \sim N(0,1)$ | Identified | A, E, G, $G^2$, AE, AG, EG | A, E, G, $A^2$, $E^2$, $G^2$, AE, AG, EG |
| | Type I errors | C, D, F, H, J, AD, AJ, DH | B, C, D, J, $B^2$, $C^2$, $J^2$, AB, AC, AD, AF, AJ, BE, BF, BG, CD, CE, CF, DE, DF, DG, DJ, EJ, FG, FJ |
| | Type II errors | $A^2$, $E^2$ | None |
| $\varepsilon \sim N(0,2)$ | Identified | A, E, G, AE, AG, EG | A, E, G, $A^2$, $E^2$, $G^2$, AE, AG, EG |
| | Type I errors | J, EJ | $H^2$, CD, DG |
| | Type II errors | $A^2$, $E^2$, $G^2$ | None |
| $\varepsilon \sim N(0,3)$ | Identified | A, E, G, AE, AG, EG | E, G, $E^2$, $G^2$, AE, AG, EG |
| | Type I errors | D, F, H, $H^2$, DF | $B^2$, $D^2$, AH, BG, DG |
| | Type II errors | $A^2$, $E^2$, $G^2$ | A, $A^2$ |
| $\varepsilon \sim N(0,5)$ | Identified | A, E, G, AG | $G^2$, AG |
| | Type I errors | C, D, F, H, J, $F^2$, CH, EJ | AJ |
| | Type II errors | $A^2$, $E^2$, $G^2$, AE, EG | A, E, G, $A^2$, $E^2$, AE, EG |
Table 24. Second Order Screening Design Results: Case 2

Strong Heredity, Effect Sparsity Model:
$2A - 1.5E + 2G + 4C - 3H + 2.5E^2 - 5CG + 3.5EH - 4GH + \varepsilon$

| Scenario | Result | DSD | FBBD |
|---|---|---|---|
| $\varepsilon \sim N(0,1)$ | Identified | A, C, E, G, H, $E^2$, CG, EH, GH | A, C, E, G, $E^2$, CG |
| | Type I errors | D, F, J, $J^2$, CF, DH | B, H, J, $A^2$, $B^2$, $C^2$, $G^2$, $J^2$, AB, AC, AD, AE, AF, AJ, BD, BE, BF, BG, BJ, CD, CE, CF, DE, DF, DG, DJ, EG, EJ, FG, FJ, GJ |
| | Type II errors | None | H, EH, GH |
| $\varepsilon \sim N(0,2)$ | Identified | A, C, G, H, CG, GH | A, C, E, G, H, $E^2$, CG, EH, GH |
| | Type I errors | J, $C^2$, CJ, GJ | D, $A^2$, $H^2$, AC, CD, DG |
| | Type II errors | E, $E^2$, EH | None |
| $\varepsilon \sim N(0,3)$ | Identified | A, C, E, G, H, $E^2$ | C, E, G, H, $E^2$, CG, EH, GH |
| | Type I errors | D, F, $A^2$, $C^2$, AF, DG, DH | $A^2$, $B^2$, $D^2$, $G^2$, AH, BG, DG |
| | Type II errors | CG, EH, GH | A |
| $\varepsilon \sim N(0,5)$ | Identified | A, E, G, C, H, EH | C, H, CG |
| | Type I errors | D, F, AE, AG, CE, CF | AJ, FJ |
| | Type II errors | $E^2$, CG, GH | A, G, E, $E^2$, EH, GH |

Table 25. Second Order Screening Design Results: Case 3

Weak Heredity, Factor Sparsity Model:
$2A + 2E - 1.5A^2 + 2.5E^2 - 3.5AE + 4AG - 5EG + \varepsilon$

| Scenario | Result | DSD | FBBD |
|---|---|---|---|
| $\varepsilon \sim N(0,1)$ | Identified | A, E, $A^2$, $E^2$, AE | A, E, $A^2$, $E^2$, AE, AG, EG |
| | Type I errors | B, D, F, H, J, $J^2$, BF, DJ, FH | B, C, D, G, J, $B^2$, $C^2$, $G^2$, $J^2$, AB, AC, AD, AF, AJ, BE, BF, BG, CD, CE, CF, DE, DF, DG, DJ, EJ, FG, FJ |
| | Type II errors | AG, EG | None |
| $\varepsilon \sim N(0,2)$ | Identified | A, E, $E^2$, AE | A, E, $E^2$, AE, AG, EG |
| | Type I errors | C, H, J, $J^2$, CH, EJ | $H^2$, CD, DG |
| | Type II errors | $A^2$, AG, EG | $A^2$ |
| $\varepsilon \sim N(0,3)$ | Identified | A, E, AE, AG, EG | E, $E^2$, EG |
| | Type I errors | D, F, G, H, AF, DF | DG |
| | Type II errors | $A^2$, $E^2$ | A, $A^2$, AE, AG |
| $\varepsilon \sim N(0,5)$ | Identified | A, E, $E^2$, AG | A, AG |
| | Type I errors | D, F, G, H, $F^2$, DE, FH | $F^2$, AD, DF, DG |
| | Type II errors | $A^2$, AE, EG | E, $A^2$, $E^2$, AE, EG |
Table 26. Second Order Screening Design Results: Case 4

Weak Heredity, Effect Sparsity Model:
$2A - 1.5E + 2G + 2.5E^2 - 3H^2 + 4AC - 5CG + 3.5EH - 4GH + \varepsilon$

| Scenario | Result | DSD | FBBD |
|---|---|---|---|
| $\varepsilon \sim N(0,1)$ | Identified | A, E, G, AC, GH | A, E, G, $E^2$, $H^2$, AC, CG, EH, GH |
| | Type I errors | C, D, F, H, J, $G^2$, AF, AH, DF, EF | B, C, D, H, J, $B^2$, $C^2$, $G^2$, $J^2$, AB, AD, AE, AJ, BD, BE, BG, BH, CD, DE, DG, DH, DJ, EG, EJ, HJ |
| | Type II errors | $E^2$, $H^2$, CG, EH | None |
| $\varepsilon \sim N(0,2)$ | Identified | A, E, G, $E^2$, GH | A, E, G, $E^2$, $H^2$, AC, CG, EH, GH |
| | Type I errors | H, J, $J^2$, AE, EG, HJ | $A^2$, CD, DG |
| | Type II errors | $H^2$, AC, CG, EH | None |
| $\varepsilon \sim N(0,3)$ | Identified | A, E, G, GH | E, G, $E^2$, $H^2$, CG, EH, GH |
| | Type I errors | D, F, H, $A^2$, AD, AH, DE, DG, FH | $A^2$, $B^2$, $D^2$, $G^2$, AH, BG, DG |
| | Type II errors | $E^2$, $H^2$, AC, CG, EH | A, AC |
| $\varepsilon \sim N(0,5)$ | Identified | A, E, G, AC | E, $H^2$ |
| | Type I errors | B, C, D, F, $F^2$, BF, BG, CF | B, D, E, H, J, $B^2$, $F^2$, $H^2$, AB, AD, AE, BD, BF, BH, BJ, DE, DF, EF, FH, FJ, HJ |
| | Type II errors | $E^2$, $H^2$, CG, EH, GH | A, G, $E^2$, AC, CG, EH, GH |
As expected for both the FBBD and DSD, Type II errors increased in all four
cases as the noise level increased. However, whereas the increase in the number of
Type II errors across the cases for the DSD was on the order of 1 to 3, the increase in
Type II errors for the FBBD was far greater, from 5 to 7, suggesting the DSD may
be more robust to noise effects. Unfortunately, the DSD did not exhibit robust results
with respect to whether the weak or strong heredity and factor or effect sparsity
assumptions held. When comparing strong heredity to weak heredity, the
DSD performed better when strong heredity was exhibited, particularly when effect
sparsity was present. Similarly, when comparing factor sparsity to effect sparsity,
the DSD performed better when factor sparsity was exhibited, particularly when
strong heredity was present. Overall, this is not surprising, as the analysis method
for the DSD forces a strong heredity model and the DSD has more power in determining
active effects when fewer factors are active (Jones and Nachtsheim, 2011a). The DSD
performance is inferior to the FBBD, in terms of Type II errors, at the lower noise
levels, Scenarios 1 and 2, in all but Case 2, which represented strong heredity and
effect sparsity. However, in all but one scenario in one case (Case 2, Scenario 2),
the Type II errors were limited to mostly pure quadratic effects and a few two-factor
interactions. This result carries across all case and scenario combinations for the
DSD and is likely a by-product of the design, which focuses on main effects that are
unbiased by any second-order effect (pure quadratic or two-factor interaction) and
in which second-order effects have some correlation but are not completely confounded
with other second-order effects.

In contrast, the FBBD has no discernible pattern in Type II errors. This implies
the FBBD is just as likely to miss important main effects as second-order effects,
especially at the higher noise levels. However, at the lower noise levels, Scenarios 1
and 2, the FBBD made only a few Type II errors. In so doing, the FBBD consistently
overspecified a model, particularly at the lowest noise level, Scenario 1. With regard
to robustness to heredity and sparsity, the FBBD performed equally well for weak and
strong heredity and factor and effect sparsity, excluding Case 2. However, even with
nearly double the number of runs, the FBBD is susceptible to excluding an active
main effect during the initial stages of design analysis based upon RMSE as the noise
level increases, and, as evidenced by Case 2, Scenario 1, where active factor H was
removed in the first step, this can occur even at the lowest noise level.

Table 27 displays the average percentages of all active effects, second-order effects,
and pure quadratic effects correctly identified, based on five independent replications
for all four cases considered and for each level of random noise. Clearly, the larger
FBBD does quite well compared to the DSD, but its performance degrades as noise levels
increase. The smaller DSD does reasonably well under strong heredity but does seem to
struggle to find the interactions and quadratic effects.

3.5  Conclusions 

Regardless of the heredity (weak or strong), sparsity (effect or factor), and noise
level combination, the DSD is robust in its ability to correctly identify active main
effects. At lower noise levels, the DSD performs favorably in identifying active
two-factor interactions, but as the noise level increases the DSD performance suffers.
Additionally, regardless of case or scenario, the DSD struggled to find active quadratic
effects. However, if the experimenter has prior knowledge regarding the importance
of second-order effects, especially pure quadratic effects, and wishes to maintain the
requirement for a single design without follow-up design runs, augmenting the DSD
could reduce the correlation between a factor's second-order effects without sacrificing
too much in the way of design run efficiency. For instance, within many physical
models of complex aerodynamic systems, a quadratic “velocity” factor is often present.
Table 27. Second Order Screening Design Results: Average (5 Rep Avg)

| Model | Scenario | DSD Identified | FBBD Identified |
|---|---|---|---|
| Strong Heredity, Factor Sparsity | $\varepsilon \sim N(0,1)$ | 67%, 50%, 20% | 100%, 100%, 100% |
| | $\varepsilon \sim N(0,2)$ | 58%, 37%, 20% | 96%, 97%, 100% |
| | $\varepsilon \sim N(0,3)$ | 51%, 33%, 27% | 80%, 90%, 100% |
| | $\varepsilon \sim N(0,5)$ | 49%, 23%, 0% | 53%, 57%, 67% |
| Strong Heredity, Effect Sparsity | $\varepsilon \sim N(0,1)$ | 98%, 95%, 80% | 91%, 90%, 100% |
| | $\varepsilon \sim N(0,2)$ | 78%, 55%, 20% | 96%, 100%, 100% |
| | $\varepsilon \sim N(0,3)$ | 84%, 70%, 40% | 62%, 65%, 100% |
| | $\varepsilon \sim N(0,5)$ | 69%, 30%, 0% | 53%, 50%, 40% |
| Weak Heredity, Factor Sparsity | $\varepsilon \sim N(0,1)$ | 80%, 72%, 50% | 100%, 100%, 100% |
| | $\varepsilon \sim N(0,2)$ | 60%, 44%, 20% | 86%, 80%, 60% |
| | $\varepsilon \sim N(0,3)$ | 51%, 32%, 0% | 69%, 64%, 80% |
| | $\varepsilon \sim N(0,5)$ | 60%, 44%, 20% | 46%, 48%, 40% |
| Weak Heredity, Effect Sparsity | $\varepsilon \sim N(0,1)$ | 49%, 23%, 0% | 96%, 97%, 100% |
| | $\varepsilon \sim N(0,2)$ | 47%, 23%, 10% | 91%, 93%, 90% |
| | $\varepsilon \sim N(0,3)$ | 44%, 20%, 10% | 76%, 77%, 100% |
| | $\varepsilon \sim N(0,5)$ | 51%, 27%, 30% | 53%, 50%, 60% |

Note: Identified percentages correspond to percentage of active effects,
second-order effects, and pure quadratic effects.
The FBBD has no discernible pattern in Type II errors. This implies the FBBD is
just as likely to miss important main effects as second-order effects. Thus, modifications
to the FBBD design that would at least eliminate main effect Type II errors
are not readily apparent. In addition, since the FBBD overspecifies models, as indicated
by the large number of Type I errors, particularly at lower noise levels, it
seems further fractionation of the FBBD is possible, which could reduce design run size
requirements without sacrificing Type II error performance.

Each design was examined using its authors’ recommended analysis method. Employing
different analysis methods may yield improved performance of the design. For
instance, it is possible to analyze the FBBD with the forward stepwise regression method
used on the DSD.

Whenever a screening design is employed, analytical tradeoffs must be accepted.
Overall, both designs performed well in the environment for which they were designed.
The DSD is run size efficient when strong heredity and factor sparsity are present
and when few second-order effects are active. In contrast, the FBBD diminishes the
importance of the heredity and sparsity assumptions, but at the cost of additional
design runs. Depending upon subject matter expertise regarding a system under
study, selection or modification of one or both of the designs could be useful
within many high-technology industries.
IV. Augmentation of Definitive Screening Designs (DSD+)


4.1  Introduction 

Response surface methodology (RSM) focuses on approximating a real world system
response, typically with either a first-order or second-order polynomial model.
While the choice of experimental designs for first-order models is fairly straightforward
depending upon the shape of the experimental design region and number of
available experimental runs, choosing an experimental design to fit a second-order
model,


$$\eta = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} x_i x_j, \tag{4.1}$$

is  more  complex  due  to  the  variety  of  design  criteria  and  characteristics  to  consider. 
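
To give a sense of the size of the model in Equation 4.1, the following minimal sketch enumerates its terms for a given set of factors. The function name `second_order_terms` and the string labels for the terms are illustrative assumptions, not notation from the studies cited here.

```python
from itertools import combinations

def second_order_terms(factors):
    """List the terms of a full second-order model (Equation 4.1), intercept excluded."""
    main = list(factors)                                          # k linear terms
    quadratic = [f + "^2" for f in factors]                       # k pure quadratic terms
    interactions = [a + b for a, b in combinations(factors, 2)]   # k(k-1)/2 interactions
    return main + quadratic + interactions

# Nine factors labeled A-H and J (the letter I is skipped, as in the tables of this chapter).
terms = second_order_terms("ABCDEFGHJ")
print(len(terms))
```

For k = 9 the count is 9 + 9 + 36 = 54 effects plus the intercept, which is the total number of effects referred to in the case study of Section 4.4.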

Usually, the experimenter does not have a priori knowledge regarding the appropriate
polynomial model to use to approximate the system response. As such, it is
common practice in RSM to employ experiments sequentially. Box and Liu (1999)
illustrated the RSM philosophy of sequential learning where first-order designs are
typically used to perform factor screening and second-order designs are used to fit a
response surface exhibiting some degree of curvature. Since the a posteriori knowledge
about a system response possessing curvature comes from analysis of the first-order
design, the typically sequential nature of RSM allows developing second-order designs
by augmenting first-order designs with additional experimental runs.

Whether due to time, budget, or other constraints, there are times when conducting
multiple experiments is unrealistic. For instance, Lawson (2003) points out that fixed
deadlines for scale up and production of prototype engineering designs may not allow
the possibility of follow-up experimentation. Because military systems, particularly
aerodynamic systems, are also complex and often exhibit nonlinear behavior, a single
experimental design capable of performing both factor screening and higher order
response surface exploration may be required.

Recent literature has proposed second-order screening design methodologies, sometimes
referred to as One-Step RSM or Definitive Screening, employing a single experimental
design capable of both factor screening and fitting a second-order polynomial
model.

Edwards and Truong (2011) performed a simulation study examining several
second-order screening designs, focusing on the designs’ ability to correctly identify
active factors under a variety of conditions. The truth models used assumed both
factor sparsity and strong effect heredity.

Sparsity and heredity are two important principles considered during the development
of successful screening designs. The sparsity principle stems from the Pareto
principle, which has led to an assumption in screening designs that only a small number
of factors are significant in their contribution to an appropriate polynomial model
approximation of a system response (factor sparsity). However, the degree to which
factor sparsity holds as the number of factors being investigated grows has been debated.
The term effect sparsity has been used to describe the assumption that,
instead of the number of active factors being relatively small in the polynomial model
approximation, the number of active effects is relatively small. As a result, it is
possible for the assumption of effect sparsity to hold while factor sparsity does not.

Heredity, either strong or weak, is another screening principle considered during
model selection. Strong heredity means that if a model includes a two-factor interaction,
then both its constituent main effects are also included in the model. In contrast,
weak heredity requires only that one of the two constituent main effects be included
in the model.
Dougherty et al. (2013b) examined the robustness of Definitive Screening Designs
(DSD) and Fractional Box-Behnken Designs (FBBD), two second-order screening designs,
with respect to the assumptions of sparsity (factor or effect) and heredity
(strong or weak). Dougherty et al. (2013b) showed that regardless of the heredity
(weak or strong), sparsity (effect or factor), or noise level combination, the DSD is
robust in its ability to correctly identify active main effects. At lower noise levels,
the DSD performs favorably in identifying active two-factor interactions, but as the
noise level increases the DSD performance suffers. Additionally, the DSD had trouble
identifying active pure quadratic effects when two-factor interactions are present. As
a result, if the experimenter has a priori knowledge regarding the importance of a
particular factor, or that factor’s second-order effects, augmentation of the DSD could
reduce the correlation between a factor’s second-order effects without sacrificing too
much in the way of design run efficiency while maintaining the requirement for a
single design. Conversely, if the experimenter has a posteriori knowledge about a
particular factor or factors’ second-order effects, augmenting the DSD demonstrates
the feasibility of follow-up design runs for the DSD.

The remainder of this paper is organized as follows. Section 4.2 briefly discusses
the literature relevant to second-order screening designs, while Section 4.3 focuses
on the generation and augmentation of Definitive Screening Designs. In Section 4.4,
we present a side-by-side comparison of the Definitive Screening Design examined in
Dougherty et al. (2013b) with an augmented design, focusing on improved robustness
to the assumptions of heredity and sparsity and on identification of significant
second-order effects. Section 4.5 examines the effect of replicating the analysis on the designs’
ability to identify important factors of interest, and Section 4.6 concludes the article.
4.2 Second Order Screening Designs


Initial attempts at identifying second-order screening designs relied upon the design’s
projection capacity. When the factor sparsity principle holds, any regular fractional
factorial design of resolution R projects onto any subset of R − 1 factors as a
full factorial. For example, a $2^{3-1}_{III}$ design (R = 3) can project into a $2^2$ design (Myers
and Anderson-Cook, 2009). This projection property extends to nonregular designs
like the Plackett-Burman designs discussed in Lin and Draper (1992) and Wang and
Wu (1995).
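
The projection property for the resolution III example can be checked directly. The sketch below builds the half fraction of a $2^3$ design with the generator C = AB, a standard choice assumed here for illustration, and verifies that every pair of columns forms a full $2^2$ factorial. The helper name `is_full_factorial` is hypothetical.

```python
from itertools import product

# Half fraction of a 2^3 design with generator C = AB (defining relation I = ABC, resolution III).
design = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

def is_full_factorial(runs, cols):
    """True if the projection onto `cols` contains every +/-1 combination exactly once."""
    projected = sorted(tuple(run[c] for c in cols) for run in runs)
    return projected == sorted(product((-1, 1), repeat=len(cols)))

for cols in [(0, 1), (0, 2), (1, 2)]:
    print(cols, is_full_factorial(design, cols))
```

Each of the three two-factor projections (R − 1 = 2) is a complete factorial, while the four runs cannot form a full factorial in all three factors.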

Cheng and Wu (2001), hereafter referred to as CW, studied three orthogonal array
(OA) designs: OA(18, $3^7$), OA(27, $3^8$), and OA(36, $3^{12}$). The OA(N, $3^k$) notation
shows the design’s number of runs N and number of factors k. In contrast to $3^{n-k}$
designs, which have defining contrast subgroups to describe the design structure, the
OA(N, $3^k$) designs studied by CW required computer search to classify the possible
projected designs.

Because a design can project onto many different combinations of factors, CW
developed a projection-efficiency criterion to compare designs based upon (1) the
number of eligible projected designs and (2) the estimation efficiency for eligible
projected designs, determined by the ratio of each design’s D- and G-efficiencies
(Cheng and Wu, 2001). Eligible designs are designs able to fit a second-order model, and
the D- and G-efficiency criteria, denoted $D_{eff}$ and $G_{eff}$, respectively, compare the
performance of a design against a corresponding optimal design (Myers and Anderson-Cook,
2009).

Under  the  assumptions  of  factor  sparsity  and  strong  heredity,  CW  introduced 
a  two-stage  analysis  method.  The  first  stage  consisted  of  performing  a  main  effect 
factor  screening  analysis  and  the  second  stage  involved  fitting  a  second-order  model 
with  the  identified  main  effects  from  the  first  stage.  The  key  linkage  between  stage 
one and two was the ability to project the initial larger factor space onto a smaller
factor  space  capable  of  fitting  a  second-order  model.  Unfortunately,  the  designs  CW 
studied  have  no  guarantee  as  to  their  ability  to  project  down  to  a  specific  subset  of 
the  original  factors  and  no  flexibility  in  modifying  the  number  of  design  runs. 

Improving on the designs of CW, Xu et al. (2004), hereafter referred to as XCW,
proposed a combinatorial method for constructing new and efficient OA designs and a
design selection approach based upon a projection aberration criterion, which combines
the generalized word-length pattern of the generalized minimum aberration criterion
(Xu and Wu, 2001) for factor screening and the projection-efficiency criteria (Cheng
and Wu, 2001) for interaction detection. XCW assessed the projection performance
of three combinatorially non-isomorphic OA(18, $3^7$)s and three combinatorially
non-isomorphic OA(27, $3^{13}$)s. Their three-step approach involves: (1) screening out poor
orthogonal arrays for factor screening using the generalized word-length pattern, (2)
applying the projection aberration criterion to select a best design from step 1, and
(3) determining the best level permutations of the design from step 2 to improve design
projection eligibility and estimation efficiency under the second-order polynomial
model.

Ye et al. (2007), hereafter referred to as YTL, also examined 3-level 18-run and
27-run orthogonal designs; however, in addition to considering the projection properties
of designs, their design choices were based on both model estimation and model
discrimination criteria. The two model estimation criteria employed examine the proportion
of estimable models, Estimation Capacity (EC), and the average D-efficiency of
all models, Information Capacity (IC). YTL employed two of the six non-Bayesian
criteria, Average Expected Prediction Differences (AEPD) and Minimum Maximum
Prediction Difference (MMPD), proposed by Jones et al. (2007) for model discrimination.
While previous work focused primarily on the designs’ projection capacity, Edwards
and Truong (2011) applied the method of Jones and Nachtsheim (2011b) for finding
efficient designs with minimal aliasing between main effects and two-factor interactions.
Edwards and Truong (2011) constructed 18-, 27-, and 30-run designs, termed MA
designs, for simultaneous screening and response surface optimization for k =
4 to 7, k = 4 to 13, and k = 6 to 14 factors, respectively, by minimizing the sum
of squares of the elements of the alias matrix, A, subject to a lower bound on the
primary model D-efficiency. Edwards and Truong (2011) compared the 27-run OAs
of XCW and YTL with MA designs in terms of D-efficiency of projection and, via
a simulation study, the proportion of active factors declared significant (Power 1) as
well as the proportion of simulations in which only the true active factors are declared
significant (Power 2). Although ranked last in terms of D-efficiency, the MA designs
showed superior performance in their ability to detect active factors (Edwards and
Truong, 2011).

For simplicity, the CW, XCW, YTL, and MA designs use a linear and quadratic
main-effects-only analysis for factor screening, but the Bayesian approaches of Box
and Meyer (1993) or Chipman et al. (1997) can also be used to screen for significant
factors outside of main effects. However, these methods are not readily available in
statistical software packages and are computationally intensive procedures, thus likely
making their use impractical (Edwards and Truong, 2011). Unfortunately, as shown
by Truong (2010), if the strong heredity principle fails to hold, important effects can
be missed, leading to a misspecified second-order polynomial model.

Edwards and Mee (2011) introduced the spherical FBBD, aimed at overcoming
the projection deficiencies and main/quadratic-effect-only analysis issues found in
the CW/XCW/YTL/MA designs. FBBDs provide the ability to explore interactions
during the screening stage and to fit second-order models by applying a backward
elimination analysis strategy to each of the (k − 1)-factor projections. In contrast to
the CW/XCW/YTL/MA designs, Edwards and Mee (2011) assumed an effect sparsity
rather than a factor sparsity model and searched for designs having larger factor eligible
projections than the CW/XCW/YTL/MA designs by taking subsets of the two-level
fractional factorial designs which compose a BBD. While FBBDs require more runs
than CW/XCW/YTL/MA designs, their ease of construction and aliasing structure
facilitate an analysis strategy which cannot be applied to the CW/XCW/YTL/MA
designs.

Jones and Nachtsheim (2011a) introduced a class of three-level designs referred to
as Definitive Screening Designs (DSD), where main effects are not biased by second-order
effects and all quadratic effects are estimable. For k ≥ 6, the DSD can project
down to a full quadratic model in any three factors.

4.3  Definitive  Screening  Design  Augmentation 

Jones and Nachtsheim (2011a) used a computerized search algorithm to create
the DSD, with 2k + 1 runs to investigate k factors. The DSD consists of k fold-over
pairs for k factors and a single center point. The search algorithm forces each run
in a fold-over pair to maintain a single factor at its center point while forcing the
remaining factors to their extremes (±1). The DSD is constructed using a variant of
the coordinate exchange algorithm of Meyer and Nachtsheim (1995) to maximize the
determinant of the information matrix of the main effects model while maintaining the
desired design structure.

To guard against local maxima, Jones and Nachtsheim (2011a) use multiple random
starting designs for each k-factor design; however, Xiao et al. (2012) demonstrate
a method for generating globally optimal DSDs for even values of k through the
use of conference matrices. Table 28 shows the nine-factor DSD generated by JMP
10. JMP 10 uses the conference matrix method of Xiao et al. (2012) even when k
is odd by producing a DSD for k + 1 factors and removing the factor settings in
column k + 1. As a result, when k is odd, the DSD has 2k + 3 runs. When k is even, the
DSD maintains the 2k + 1 runs originally proposed by Jones and Nachtsheim
(2011a).
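
The conference matrix route can be sketched in a few lines. The example below is a minimal illustration, not the JMP 10 implementation: it assumes the Paley construction as one standard source of a symmetric conference matrix (valid for order q + 1 with q a prime and q ≡ 1 mod 4), and the function names are hypothetical. The resulting matrix need not coincide with the one tabulated by Xiao et al. (2012) or used to produce Table 28.

```python
def paley_conference_matrix(q):
    """Symmetric conference matrix of order q + 1 for a prime q with q % 4 == 1."""
    squares = {(x * x) % q for x in range(1, q)}       # nonzero quadratic residues mod q

    def chi(d):                                        # quadratic character of d mod q
        d %= q
        return 0 if d == 0 else (1 if d in squares else -1)

    first_row = [0] + [1] * q
    rest = [[1] + [chi(i - j) for j in range(q)] for i in range(q)]
    return [first_row] + rest

def dsd_from_conference(C):
    """Stack C, its foldover -C, and one center run: a (2k + 1)-run design in k factors."""
    k = len(C)
    return [row[:] for row in C] + [[-x for x in row] for row in C] + [[0] * k]

design = dsd_from_conference(paley_conference_matrix(5))   # k = 6 factors, 13 runs

# Odd k: build the design for k + 1 factors and drop the last column (2k + 3 runs).
five_factor = [row[:-1] for row in design]
print(len(design), len(five_factor))
```

In this sketch the main effect columns of the six-factor design are mutually orthogonal, and each run outside the center point holds exactly one factor at zero, matching the structure described above.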

Table  28.  Nine-Factor  Definitive  Screening  Design  (DSD) 
The 2k + 1 or 2k + 3 runs, for k even or odd, respectively, provide a
sufficient number of degrees of freedom for estimates of the intercept, all k main
effects, and all k pure quadratic effects. However, Dougherty et al. (2013b) showed that
when both two-factor interactions and pure-quadratic effects are active, regardless of
heredity (strong or weak) or sparsity (factor or effect), the standard DSD may not have
enough degrees of freedom to decouple the correlation between two-factor interactions
and pure-quadratic effects. As a result, the DSD, when used as a single experimental
design, is susceptible to making Type II errors, particularly with regard to active
pure-quadratic effects. Because the DSD is very run efficient when compared to other
second-order screening designs, augmenting the original DSD to improve detection of
active second-order effects (both two-factor interactions and pure-quadratic) is desirable.

If the experimenter has a priori knowledge regarding the importance of a particular
factor or factors’ second-order effects, augmentation of the DSD, hereafter
referred to as DSD+, could reduce the correlation between a factor’s second-order effects
without sacrificing too much in the way of design run efficiency while maintaining
the requirement for a single design. Conversely, if the experimenter has a posteriori
knowledge about a particular factor or factors’ second-order effects, augmenting the
DSD demonstrates the feasibility of follow-up design runs for the DSD.

Common approaches to design augmentation to clarify model ambiguity involve
augmenting the design with runs specifically chosen to de-alias a specific
alias chain or using complete or fractional foldovers of the design. Since DSDs
are essentially already full foldover designs, using the foldover approach on a DSD does
not reduce aliasing between second-order effects. Additionally, the alias chains for
DSDs are very complex due to the nature of the design construction. Therefore, an
alternative approach using a D-optimal strategy for selecting augmentation points is
employed.

Similar to Jones and Nachtsheim (2011a), a computerized search algorithm is used
to add k − 1 runs to the DSD. However, instead of the information matrix being that of
a main effects model only, the information matrix contains the main effects and the k − 1
two-factor interactions involving a particular factor. The DSD+ were constructed
using a variant of the coordinate exchange algorithm of Meyer and Nachtsheim (1995)
to maximize the determinant of the updated information matrix. Multiple random
starting designs for each k-factor design were explored to guard against local maxima;
however, the generated designs were still not unique. Multiple designs were generated
which were equivalent based upon both D-optimal and I-efficient criteria, although
the number of different designs decreased as k increased.
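
The augmentation search can be illustrated with a small, self-contained sketch. Several assumptions are made for brevity and should not be read as the procedure actually used to build the DSD+: the base design is a six-factor DSD obtained from a Paley conference matrix rather than the nine-factor JMP 10 design, the determinant is computed by plain Gaussian elimination, only two random starts are used, and all function names are hypothetical. The model contains the intercept, the main effects, and the k − 1 two-factor interactions involving the first factor, and only the k − 1 added runs are exchanged.

```python
import random

def paley_dsd(q=5):
    """(2k + 1)-run DSD in k = q + 1 factors from a Paley conference matrix (q prime, q % 4 == 1)."""
    squares = {(x * x) % q for x in range(1, q)}
    def chi(d):
        d %= q
        return 0 if d == 0 else (1 if d in squares else -1)
    C = [[0] + [1] * q] + [[1] + [chi(i - j) for j in range(q)] for i in range(q)]
    return C + [[-x for x in row] for row in C] + [[0] * (q + 1)]

def model_row(run):
    """Intercept, main effects, and two-factor interactions involving the first factor."""
    return [1] + list(run) + [run[0] * x for x in run[1:]]

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n, result = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return result

def d_criterion(runs):
    """det(X'X) for the model defined by model_row."""
    X = [model_row(r) for r in runs]
    p = len(X[0])
    return max(det([[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]), 0.0)

def augment(base, n_new, starts=2, seed=0):
    """Coordinate exchange over the added runs only; the base runs stay fixed."""
    rng = random.Random(seed)
    k = len(base[0])
    best_runs, best_val = None, -1.0
    for _ in range(starts):
        new = [[rng.choice((-1, 0, 1)) for _ in range(k)] for _ in range(n_new)]
        val = d_criterion(base + new)
        improved = True
        while improved:                      # repeat passes until no exchange helps
            improved = False
            for run in new:
                for j in range(k):
                    for level in (-1, 0, 1):
                        old = run[j]
                        if level == old:
                            continue
                        run[j] = level
                        trial = d_criterion(base + new)
                        if trial > val * (1 + 1e-9):
                            val, improved = trial, True   # keep the exchange
                        else:
                            run[j] = old                  # undo the exchange
        if val > best_val:
            best_runs, best_val = [r[:] for r in new], val
    return best_runs, best_val

base = paley_dsd(5)                                   # 13-run DSD in k = 6 factors
added, value = augment(base, n_new=5, starts=2)       # k - 1 = 5 added runs
print(len(base) + len(added), "runs; det(X'X) =", value)
```

Because exchanges are accepted only when the determinant increases, the search is monotone but can stop at a local maximum, which is why several random starts are tried; different seeds may return different, equally good sets of added runs.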

Table 29 shows the k = 9 factor DSD generated by JMP 10 plus k − 1 = 8
augmentation runs after updating the information matrix to include the 8 two-way
interactions involving factor A.

4.4  Case  Comparison 

Dougherty et al. (2013b) conducted an empirical study of the nine-factor definitive
screening design generated using conference matrices based on Xiao et al. (2012),
focusing on the design’s robustness in detecting important effects in models exhibiting
different combinations of heredity and sparsity. Using the analysis methodology
recommended by Jones and Nachtsheim (2011a), the cases and scenarios studied are
reexamined using the DSD+.

Jones and Nachtsheim (2011a) suggest performing a forward stepwise regression
which considers all terms in a second-order model of k = 9 factors. With a p-value
of 0.1 to enter, effects are added into the second-order model while forcing a strong
heredity model. As such, when either two-factor interactions or pure-quadratic effects
are included in the model, the lower order terms must also be included.

Four cases were considered to represent different combinations of model heredity
(strong or weak) and sparsity (factor or effect). In addition, each model was examined
with four different noise level scenarios; however, the noise level vector used for each
scenario was identical across each model for each design. The 21 and 29 treatment
combinations for the DSD and DSD+ designs are given in Tables 28 and 29, respectively.
Table 30 shows the simulated response values for the 16 combinations of case
and noise level scenario for the original DSD runs and the eight additional runs for
the DSD+.

Table 29. Nine-Factor Augmented Definitive Screening Design (DSD+)

Case  1  data  was  simulated  based  on  the  model 


$$y_i = 2A_i - 1.5E_i + 2G_i - 3A_i^2 + 2.5E_i^2 - 4G_i^2 + 4A_iE_i + 3.5A_iG_i - 5E_iG_i + \varepsilon_i, \tag{4.2}$$

thereby  representing  a  response  which  exhibits  factor  sparsity  and  strong  heredity 
between  active  two-factor  interactions  or  pure  quadratic  effects  and  main  effects. 
The  model  exhibits  factor  sparsity  because  only  3  of  the  9  factors  are  active  within 
the  9  effects  contained  in  the  model. 
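
As an illustration of how such responses can be generated, the sketch below evaluates the Case 1 truth model at a set of design runs and adds normally distributed noise. The coefficients follow the model as listed in the Table 35 header. Two assumptions are made here: the second argument of N(0, ·) is treated as a standard deviation, and the function names and the dictionary representation of a run are hypothetical. The values in Table 30 are not reproduced by this sketch.

```python
import random

def case1_mean(A, E, G):
    """Noise-free Case 1 response (Equation 4.2); the other six factors are inactive."""
    return (2 * A - 1.5 * E + 2 * G
            - 3 * A ** 2 + 2.5 * E ** 2 - 4 * G ** 2
            + 4 * A * E + 3.5 * A * G - 5 * E * G)

def simulate_case1(runs, sigma, seed=0):
    """Add N(0, sigma) noise (sigma assumed to be a standard deviation) to each run's mean."""
    rng = random.Random(seed)
    return [case1_mean(r["A"], r["E"], r["G"]) + rng.gauss(0, sigma) for r in runs]

# Two illustrative runs; only the active factors A, E, and G need to be supplied.
runs = [{"A": 0, "E": 0, "G": 0}, {"A": 1, "E": 1, "G": 1}]
print(simulate_case1(runs, sigma=1))
```

Fixing the seed keeps the noise vector identical from one model to the next, which mirrors the way the same noise vector was reused across models in the study.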

As noted above, Jones and Nachtsheim (2011a) perform forward stepwise regression
with a p-value of 0.1 to enter while forcing a strong heredity model. Table 31 shows the
forward stepwise regression steps for the Case 1 data for the noise level scenarios of
Table 30.

Since the “combined” option rule is used for the forward stepwise regression, the
inclusion of a two-way interaction or pure quadratic effect results in the inclusion of
all the factors which comprise that two-way interaction or pure quadratic effect. For
example, when considering the original DSD Scenario 3, where ε_i ~ N(0, 3), the EG
and H^2 effects, which entered the regression model in steps 1 and 5, respectively,
would require the E, G, and H factors to also be in the model.
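
The effect of this rule on the selected model can be written as a short helper. This is a sketch of the bookkeeping only, with hypothetical function names and term labels such as 'EG' and 'H^2'; it is not the stepwise routine of any particular software package.

```python
def parents(term):
    """Main effects implied by a term such as 'EG' (interaction) or 'H^2' (pure quadratic)."""
    if term.endswith("^2"):
        return {term[0]}
    return set(term)          # 'EG' -> {'E', 'G'}; a main effect 'A' -> {'A'}

def with_heredity(selected):
    """Model terms after adding every parent main effect of the selected terms."""
    model = set(selected)
    for term in selected:
        model |= parents(term)
    return model

print(sorted(with_heredity(["EG", "H^2"])))
```

Applied to the two terms from the example, the helper returns EG and H^2 together with the main effects E, G, and H.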

Case  2  data  was  simulated  according  to  the  model 


$$y_i = 2A_i - 1.5E_i + 2G_i + 4C_i - 3H_i + 2.5E_i^2 - 4G_iH_i + 3.5E_iH_i - 5C_iG_i + \varepsilon_i, \tag{4.3}$$

to represent a response exhibiting effect sparsity and strong heredity between active
two-factor interactions or pure quadratic effects and their associated main effects.
The model exhibits effect sparsity rather than factor sparsity because, although over 50% of
the factors (5 of 9) are active, only 9 of 54 total effects are active, not coincidentally
the same number as Case 1.

Table 30. Nine-Factor Simulated Response

Table 31. Forward Stepwise Results: Case 1

Scenario   ε ~ N(0,1)       ε ~ N(0,2)       ε ~ N(0,3)
Design     DSD     DSD+     DSD     DSD+     DSD     DSD+
Step       Effects Added
1          EG      EG       AG      AG       EG      EG
2          AG      AG       EG      EG       AG      AG
3          AE      AE       EJ      AE       AE      AE
4          G^2     G^2      AE      G^2      DF      H^2
5          AJ      AJ       -       DF       H^2     AH
6          DH      GJ       -       CJ       -       BG
7          AD      A^2      -       -        -       DE
8          F       AD       -       -        -       FH
9          C       CH       -       -        -       AF
10         -       -        -       -        -       B^2

Table 32 provides the forward stepwise regression results of the Jones and Nachtsheim
(2011a) method using the Case 2 response data associated with each design for the
noise level scenarios in Table 30.

Case  3  data  was  simulated  according  to  the  model 


$$y_i = 2A_i + 2E_i - 1.5A_i^2 + 2.5E_i^2 - 3.5A_iE_i + 4A_iG_i - 5E_iG_i + \varepsilon_i, \tag{4.4}$$


thereby representing a response which exhibits factor sparsity and weak heredity
between active two-factor interactions or pure quadratic effects and main effects.
The model exhibits factor sparsity because only 3 of the 9 factors are active within
the 7 effects contained in the model. Since not all factors which comprise the
two-factor interactions are present as main effects, the model exhibits weak heredity. For
instance, although factor G is significant within two two-factor interactions, factor G
by itself is not significant.
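
The distinction between strong and weak heredity can be checked mechanically from the list of active terms. The sketch below follows the definition given in Section 4.1, which is stated for two-factor interactions, so pure quadratic terms are deliberately ignored; that restriction, the function name, and the term labels are assumptions of this illustration.

```python
def heredity_type(terms):
    """Classify a model as 'strong', 'weak', or 'none' from its two-factor interactions."""
    main = {t for t in terms if len(t) == 1}
    interactions = [t for t in terms if len(t) == 2 and t.isalpha()]
    if all(set(t) <= main for t in interactions):
        return "strong"       # both parents of every interaction are main effects
    if all(set(t) & main for t in interactions):
        return "weak"         # every interaction has at least one parent main effect
    return "none"

case1 = ["A", "E", "G", "A^2", "E^2", "G^2", "AE", "AG", "EG"]   # Equation 4.2
case3 = ["A", "E", "A^2", "E^2", "AE", "AG", "EG"]               # Equation 4.4
print(heredity_type(case1), heredity_type(case3))
```

For Case 3 the interactions AG and EG each have only one parent main effect present (A and E, respectively), which is what places the model under weak rather than strong heredity.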
Table 32. Forward Stepwise Results: Case 2

Scenario   ε ~ N(0,1)       ε ~ N(0,2)       ε ~ N(0,3)
Design     DSD     DSD+     DSD     DSD+     DSD     DSD+
Step       Effects Added
1          CG      CG       C^2     CG       C^2     CG
2          GH      GH       GH      GH       E^2     GH
3          EH      EH       CG      EH       DH      EH
4          A       A        CJ      AE       A^2     A^2
5          E^2     E^2      A       H^2      DG      DE
6          J^2     J^2      GJ      DF       AF      E^2
7          DH      DH       -       DE       -       AH
8          CF      CH       -       BJ       -       AE
9          -       CD       -       -        -       CF
10         -       CJ       -       -        -       -
11         -       D^2      -       -        -       -

Table 33 provides the forward stepwise regression results of the Jones and Nachtsheim
(2011a) method using the Case 3 response data associated with each design for the
noise level scenarios in Table 30.


Table 33. Forward Stepwise Results: Case 3

Scenario   ε ~ N(0,1)       ε ~ N(0,2)       ε ~ N(0,3)
Design     DSD     DSD+     DSD     DSD+     DSD     DSD+
Step       Effects Added
1          AE      EG       AE      AE       EG      EG
2          BF      AG       CH      AG       AG      AG
3          J^2     AE       EJ      EG       AE      AE
4          A^2     E^2      J^2     D^2      DF      -
5          FH      J^2      E^2     AF       H       -
6          DJ      CE       -       DF       AF      -
7          E^2     -        -       BE       -       -

Case  4  data  was  simulated  according  to  the  model 


$$y_i = 2A_i - 1.5E_i + 2G_i - 3H_i^2 + 2.5E_i^2 + 4A_iC_i + 3.5E_iH_i - 5C_iG_i - 4G_iH_i + \varepsilon_i \tag{4.5}$$
to represent a response which exhibits effect sparsity and weak heredity between
active  two-factor  interactions  or  pure  quadratic  effects  and  main  effects. 

Table 34 provides the forward stepwise regression results of the Jones and Nachtsheim
(2011a) method using the Case 4 response data associated with each design for the
noise level scenarios in Table 30.


Table 34. Forward Stepwise Results: Case 4

Scenario   ε ~ N(0,1)       ε ~ N(0,2)       ε ~ N(0,3)
Design     DSD     DSD+     DSD     DSD+     DSD     DSD+
Step       Effects Added
1          GH      GH       GH      AC       GH      GH
2          AH      CG       AE      CG       AH      CG
3          AF      AC       EG      EH       DE      AC
4          EF      EH       HJ      GH       AD      EH
5          G^2     J^2      E^2     AE       DG      DE
6          AC      E^2      J^2     DF       FH      A^2
7          DF      H^2      -       BH       A^2     E^2
8          J       EJ       -       -        -       AH
9          -       FH       -       -        -       FJ
10         -       CD       -       -        -       -
11         -       EF       -       -        -       -
12         -       DE       -       -        -       -

Tables 35, 36, 37, and 38 show, for each noise level scenario of Cases 1 through 4,
which effects were properly identified, incorrectly identified (Type I error), and
not identified (Type II error) for both the DSD and DSD+, based upon the analysis
methodology suggested by Jones and Nachtsheim (2011a).
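
The three categories reported in these tables are simple set comparisons between the selected model and the truth model. The sketch below, with a hypothetical function name and term labels, reproduces the DSD entries for Case 1, Scenario 1, taking as input the terms added in Table 31 together with the main effects forced into the model by the heredity rule.

```python
def classify(selected, active):
    """Split model terms into correctly identified, Type I, and Type II sets."""
    selected, active = set(selected), set(active)
    return {
        "identified": sorted(selected & active),   # active and selected
        "type_I": sorted(selected - active),       # selected but not active
        "type_II": sorted(active - selected),      # active but missed
    }

case1_active = ["A", "E", "G", "A^2", "E^2", "G^2", "AE", "AG", "EG"]
# DSD, Scenario 1: terms added in Table 31 plus their parent main effects.
dsd_selected = ["EG", "AG", "AE", "G^2", "AJ", "DH", "AD", "F", "C",
                "A", "E", "G", "J", "D", "H"]
result = classify(dsd_selected, case1_active)
print(result)
```

The output lists seven identified effects, eight Type I errors, and the two missed pure quadratic effects A^2 and E^2.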

In all four Cases, regardless of noise level, the DSD+ performance in identifying
active effects met or exceeded the DSD performance. However, similar to the DSD,
the DSD+ was still susceptible to increased Type II errors as the noise level increased.
Fortunately, the DSD+ was more robust to the heredity (strong or weak) and sparsity
(factor or effect) assumptions than the DSD. When comparing strong heredity to weak
heredity for the DSD, the DSD performed better when strong heredity was exhibited,
particularly when effect sparsity was present. In contrast, the DSD+ performed
equally well under either heredity assumption. With regard to the sparsity assumption,
the DSD+ showed better performance under effect sparsity than under factor sparsity, which
was counter to the DSD. However, the DSD+ performance under the factor sparsity
assumption was still better than that of the DSD. Interestingly, all the Type II errors
made by the DSD+ across all Scenarios and Cases involved not identifying active
pure-quadratic effects.


Table 35. Second Order Screening Design Results: Case 1

Strong Heredity, Factor Sparsity Model: Rep 1
2A - 1.5E + 2G - 3A^2 + 2.5E^2 - 4G^2 + 4AE + 3.5AG - 5EG + ε

Scenario ε ~ N(0,1)
                  DSD                            DSD+
  Identified      A, E, G, G^2, AE, AG, EG       A, E, G, A^2, G^2, AE, AG, EG
  Type I errors   C, D, F, H, J, AD, AJ, DH      C, D, H, J, AD, AJ, CH, GJ
  Type II errors  A^2, E^2                       E^2

Scenario ε ~ N(0,2)
                  DSD                            DSD+
  Identified      A, E, G, AE, AG, EG            A, E, G, G^2, AE, AG, EG
  Type I errors   J, EJ                          C, D, F, J, CJ, DF
  Type II errors  A^2, E^2, G^2                  A^2, E^2

Scenario ε ~ N(0,3)
                  DSD                            DSD+
  Identified      A, E, G, AE, AG, EG            A, E, G, AE, AG, EG
  Type I errors   D, F, H, H^2, DF               B^2, D^2, AH, BG, DG
  Type II errors  A^2, E^2, G^2                  A^2, E^2, G^2

4.5  Analysis  Replication  Results 

To ensure the improved performance of the DSD+ over the DSD was not limited to a
single instance, the response data were replicated four additional times. Table 39
displays the average percentage of all active effects, second-order effects, and
pure-quadratic effects correctly identified across five replications of all four
Cases and three Scenarios. For instance, for Case 3 (Weak Heredity, Factor Sparsity
Model), Scenario 1 (ε ~ N(0, 1)), the DSD on average correctly identified 80% of the
active effects in the model, 72% of the active second-order effects (two-way
interactions and pure-quadratic effects), and 50% of the active pure-quadratic effects.

Table 36. Second Order Screening Design Results: Case 2

Strong Heredity, Effect Sparsity Model: Rep 1
2A - 1.5E + 2G + 4C - 3H + 2.5E^2 - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, C, G, H, E^2, CG, EH, GH | A, E, C, G, H, E^2, CG, EH, GH
ε ~ N(0, 1) | Type I errors | D, F, J, J^2, CF, DH | D, J, D^2, J^2, CD, CH, CJ, DH
ε ~ N(0, 1) | Type II errors | NONE | NONE
ε ~ N(0, 2) | Identified | A, C, G, H, CG, GH | A, E, C, G, H, CG, EH, GH
ε ~ N(0, 2) | Type I errors | J, C^2, CJ, GJ | B, D, F, J, H^2, AE, BJ, DE, DF
ε ~ N(0, 2) | Type II errors | E, E^2, EH | E^2
ε ~ N(0, 3) | Identified | A, E, C, G, H, E^2 | A, E, C, G, H, E^2, CG, EH, GH
ε ~ N(0, 3) | Type I errors | D, F, A^2, C^2, AF, DG, DH | D, F, A^2, AE, AH, CF, DE
ε ~ N(0, 3) | Type II errors | CG, EH, GH | NONE

Table 37. Second Order Screening Design Results: Case 3

Weak Heredity, Factor Sparsity Model: Rep 1
2A + 2E - 1.5A^2 + 2.5E^2 - 3.5AE + 4AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, A^2, E^2, AE | A, E, E^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | B, D, F, H, J, J^2, BF, DJ, FH | C, G, J, J^2, CE
ε ~ N(0, 1) | Type II errors | AG, EG | A^2
ε ~ N(0, 2) | Identified | A, E, E^2, AE | A, E, AE, AG, EG
ε ~ N(0, 2) | Type I errors | C, H, J, J^2, CH, EJ | B, D, F, G, D^2, AF, BE, DF
ε ~ N(0, 2) | Type II errors | A^2, AG, EG | A^2, E^2
ε ~ N(0, 3) | Identified | A, E, AE, AG, EG | A, E, AE, AG, EG
ε ~ N(0, 3) | Type I errors | D, F, G, H, AF, DF | G
ε ~ N(0, 3) | Type II errors | A^2, E^2 | A^2, E^2

Table 38. Second Order Screening Design Results: Case 4

Weak Heredity, Effect Sparsity Model: Rep 1
2A - 1.5E + 2G + 2.5E^2 - 3H^2 + 4AC - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, AC, GH | A, E, G, E^2, H^2, AC, CG, EH, GH
ε ~ N(0, 1) | Type I errors | C, D, F, H, J, G^2, AF, AH, DF, EF | C, D, F, H, J, J^2, CD, DE, EF, EJ, FH
ε ~ N(0, 1) | Type II errors | E^2, H^2, CG, EH | NONE
ε ~ N(0, 2) | Identified | A, E, G, E^2, GH | A, E, G, AC, CG, EH, GH
ε ~ N(0, 2) | Type I errors | H, J, J^2, AE, EG, HJ | B, C, D, F, H, AE, BH, DF
ε ~ N(0, 2) | Type II errors | H^2, AC, CG, EH | E^2, H^2
ε ~ N(0, 3) | Identified | A, E, G, GH | A, E, G, E^2, AC, CG, EH, GH
ε ~ N(0, 3) | Type I errors | D, F, H, A^2, AD, AH, DE, DG, FH | C, D, F, H, J, A^2, AH, DE, FJ
ε ~ N(0, 3) | Type II errors | E^2, H^2, AC, CG, EH | H^2

Additionally, Table 39 displays the average number of Type I errors made. Overall,
the percentages show the DSD+ matches or outperforms the DSD in identifying active
effects and their various subsets in nearly every Case and Scenario, with little to
no increase in Type I errors. However, as the noise level increases, neither the DSD
nor the DSD+ consistently finds the active pure-quadratic effects. The individual
replication results are found in Tables 40 through 55 in the Appendix.
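
The bookkeeping behind these tables can be sketched in a few lines of Python. This is only an illustration: the function name `score_screening` and the `^2` naming convention for pure-quadratic terms are assumptions made here, not part of the original analysis. The example uses the Case 3, Rep 1, ε ~ N(0, 1) DSD selections reported in Table 37.

```python
def score_screening(active, selected):
    """Compare the effects a screening analysis selected against the true active effects."""
    active, selected = set(active), set(selected)
    identified = active & selected   # active effects correctly found
    type_i = selected - active       # inactive effects declared active
    type_ii = active - selected      # active effects that were missed

    def pct(subset):
        # Percentage of a subset of the active effects that was identified.
        subset = set(subset)
        return 100.0 * len(subset & identified) / len(subset)

    # Main effects are single letters; every longer name is a second-order effect.
    second_order = {e for e in active if len(e) > 1}
    pure_quadratic = {e for e in active if e.endswith("^2")}
    return {
        "type_i": sorted(type_i),
        "type_ii": sorted(type_ii),
        "pct_active": pct(active),
        "pct_second_order": pct(second_order),
        "pct_pure_quadratic": pct(pure_quadratic),
    }


# Case 3 model (weak heredity, factor sparsity) and the Rep 1 DSD selections.
case3_active = ["A", "E", "A^2", "E^2", "AE", "AG", "EG"]
dsd_selected = [
    "A", "E", "A^2", "E^2", "AE",                        # correctly identified
    "B", "D", "F", "H", "J", "J^2", "BF", "DJ", "FH",    # Type I errors
]
result = score_screening(case3_active, dsd_selected)
print(result)
```

Averaging the three percentages over the five replications of a Case and Scenario gives the kind of summary reported in Table 39.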


4.6  Conclusions 

For a second-order polynomial model, if a factor screening design is not used, a
design must contain enough degrees of freedom to estimate all effects. For $k$ factors
this equates to $(k+1)(k+2)/2$ design runs. As $k$ increases, the number of required
runs will quickly exceed the number of runs available to an experimenter, particularly
within the DOD testing realm. As such, as $k$ increases, a screening design must
be employed that maintains the ability to estimate a second-order polynomial
model when constraints dictate a single experiment. Jones and Nachtsheim (2011a)


Table 39. Second Order Screening Design Results: Average

Strong Heredity, Factor Sparsity Model: 5 Rep Avg

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | 67%, 50%, 20% | 91%, 87%, 73%
ε ~ N(0, 1) | Type I errors | 9.6 | 10.6
ε ~ N(0, 2) | Identified | 58%, 37%, 20% | 84%, 77%, 53%
ε ~ N(0, 2) | Type I errors | 7.4 | 5.6
ε ~ N(0, 3) | Identified | 51%, 33%, 27% | 62%, 43%, 13%
ε ~ N(0, 3) | Type I errors | 6.4 | 4.8

Strong Heredity, Effect Sparsity Model: 5 Rep Avg

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | 98%, 95%, 80% | 98%, 95%, 80%
ε ~ N(0, 1) | Type I errors | 6.6 | 7.6
ε ~ N(0, 2) | Identified | 78%, 55%, 20% | 91%, 80%, 20%
ε ~ N(0, 2) | Type I errors | 4.0 | 5.4
ε ~ N(0, 3) | Identified | 84%, 70%, 40% | 93%, 85%, 40%
ε ~ N(0, 3) | Type I errors | 4.2 | 4.2

Weak Heredity, Factor Sparsity Model: 5 Rep Avg

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | 80%, 72%, 50% | 94%, 92%, 80%
ε ~ N(0, 1) | Type I errors | 9.8 | 10.0
ε ~ N(0, 2) | Identified | 60%, 44%, 20% | 77%, 68%, 20%
ε ~ N(0, 2) | Type I errors | 8.0 | 7.0
ε ~ N(0, 3) | Identified | 51%, 32%, 0% | 77%, 68%, 20%
ε ~ N(0, 3) | Type I errors | 7.4 | 4.6

Weak Heredity, Effect Sparsity Model: 5 Rep Avg

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | 49%, 23%, 0% | 93%, 90%, 70%
ε ~ N(0, 1) | Type I errors | 10.6 | 11.2
ε ~ N(0, 2) | Identified | 47%, 23%, 10% | 84%, 77%, 30%
ε ~ N(0, 2) | Type I errors | 8.4 | 7.0
ε ~ N(0, 3) | Identified | 44%, 20%, 10% | 82%, 73%, 20%
ε ~ N(0, 3) | Type I errors | 7.6 | 6.6

Note: Identified percentages correspond to the percentage of active effects,
second-order effects, and pure-quadratic effects correctly identified, in that order.

proposed the economical three-level DSD for screening quantitative factors in the
presence of active second-order effects. Dougherty et al. (2013b) showed the DSD
was effective in identifying active main effects regardless of the heredity and sparsity
assumptions but lacked the power to differentiate between active second-order effects
when both two-factor interactions and pure-quadratic effects are active. We introduce
a way to augment the DSD with $k - 1$ runs, termed the DSD+, which increased the
detection of active second-order effects involving a particular factor of
interest. The $k - 1$ additional runs can be run as part of a single experiment with
the original DSD if the experimenter has a priori knowledge, or as part of a follow-on
experiment based upon a posteriori knowledge. Because the additional runs are
optimized for two-factor interactions, the impact of adding center point runs on
identifying active pure-quadratic effects requires further investigation.

While the $k - 1$ runs are associated with the $k - 1$ two-factor interactions of a single
factor of interest in a $k$-factor experiment, the manner in which the DSD is augmented
can easily be extended to additional factors. For instance, the DSD can be augmented
with $(k - 1) + (k - 2) = 2k - 3$ runs for all the two-factor interactions of two factors,
and so on, until a total of $k(k-1)/2$ runs are added for all the two-factor interactions
in a $k$-factor experiment. As such, the DSD can be tailored with augmentation runs which
take it from the standard $2k + 1$ runs all the way to $(k+1)(k+2)/2$ runs for a saturated
second-order design.
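
The run counts quoted above can be tabulated with a short sketch. The function names are hypothetical and the code only does the arithmetic from the text: $2k + 1$ runs for the DSD, $k - 1$ further runs for the first factor of interest, $k - 2$ for the second, and so on. It does not construct the augmentation runs themselves.

```python
def full_second_order_runs(k):
    """Minimum runs needed to estimate every term of a second-order model in k factors."""
    # 1 intercept + k main effects + k pure quadratics + k(k-1)/2 two-factor interactions
    return (k + 1) * (k + 2) // 2


def dsd_runs(k):
    """Run count of the standard definitive screening design."""
    return 2 * k + 1


def augmented_dsd_runs(k, factors_of_interest=1):
    """DSD run count after adding interaction runs for m factors of interest."""
    m = factors_of_interest
    if not 0 <= m <= k:
        raise ValueError("factors_of_interest must be between 0 and k")
    # k-1 runs for the first factor, k-2 for the second, and so on.
    extra = sum(k - i for i in range(1, m + 1))
    return dsd_runs(k) + extra


for k in (4, 6, 8, 10):
    print(k, dsd_runs(k), augmented_dsd_runs(k), augmented_dsd_runs(k, 2),
          full_second_order_runs(k))
```

With one factor of interest the augmented design has $3k$ runs, and augmenting for every factor reaches the saturated second-order count.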
Table 40. Second Order Screening Design Results: Case 1

Strong Heredity, Factor Sparsity Model: Rep 2
2A - 1.5E + 2G - 3A^2 + 2.5E^2 - 4G^2 + 4AE + 3.5AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, A^2, EG | A, E, G, A^2, E^2, G^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | B, C, D, F, B^2, D^2, BE, BF, CE, DF | B, C, F, H, F^2, BE, BF, CG, EH, FH
ε ~ N(0, 1) | Type II errors | E^2, G^2, AE, AG | NONE
ε ~ N(0, 2) | Identified | A, E, G, EG | A, E, G, A^2, G^2, AE, AG, EG
ε ~ N(0, 2) | Type I errors | B, D, F, J, F^2, J^2, AF, BD, BF, DJ, FG | D, J, J^2, DG, DJ
ε ~ N(0, 2) | Type II errors | A^2, E^2, G^2, AE, AG | E^2
ε ~ N(0, 3) | Identified | E, E^2 | A, E, G, AE, AG, EG
ε ~ N(0, 3) | Type I errors | C, H, J, CH, EJ | B, F, BF
ε ~ N(0, 3) | Type II errors | A, G, A^2, G^2, AE, AG, EG | A^2, E^2, G^2

Table 41. Second Order Screening Design Results: Case 2

Strong Heredity, Effect Sparsity Model: Rep 2
2A - 1.5E + 2G + 4C - 3H + 2.5E^2 - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, C, G, H, E^2, CG, EH, GH | A, E, C, G, H, E^2, CG, EH, GH
ε ~ N(0, 1) | Type I errors | B, F, AF, BC, BG | B, D, F, J, AH, BE, DJ, EG, FG
ε ~ N(0, 1) | Type II errors | NONE | NONE
ε ~ N(0, 2) | Identified | A, E, C, G, H | A, E, C, G, H, CG, EH, GH
ε ~ N(0, 2) | Type I errors | B, D, J, AD, BE, CE, CH, DG, EJ | B, D, AB, AD, BD, DG
ε ~ N(0, 2) | Type II errors | E^2, CG, EH, GH | E^2
ε ~ N(0, 3) | Identified | E, C, G, H, CG, EH, GH | A, E, C, G, H, CG, EH, GH
ε ~ N(0, 3) | Type I errors | CH | A^2, CH
ε ~ N(0, 3) | Type II errors | A, E^2 | E^2

Table 42. Second Order Screening Design Results: Case 3

Weak Heredity, Factor Sparsity Model: Rep 2
2A + 2E - 1.5A^2 + 2.5E^2 - 3.5AE + 4AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, A^2, E^2, AE, AG, EG | A, E, A^2, E^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | B, C, F, G, BE, CE, CF, BE, BF, CG, FH | B, C, F, G, H, J, J^2
ε ~ N(0, 1) | Type II errors | NONE | NONE
ε ~ N(0, 2) | Identified | A, E, A^2, AE, AG, EG | A, E, A^2, AE, AG, EG
ε ~ N(0, 2) | Type I errors | B, D, G, J, B^2, BD, EJ | D, G, J, J^2, DG
ε ~ N(0, 2) | Type II errors | E^2 | E^2
ε ~ N(0, 3) | Identified | A, E, AE, AG, EG | A, E, AE, AG, EG
ε ~ N(0, 3) | Type I errors | C, G, H, CH | B, F, G, BF
ε ~ N(0, 3) | Type II errors | A^2, E^2 | A^2, E^2

Table 43. Second Order Screening Design Results: Case 4

Weak Heredity, Effect Sparsity Model: Rep 2
2A - 1.5E + 2G + 2.5E^2 - 3H^2 + 4AC - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, EH, GH | A, E, G, E^2, H^2, AC, CG, EH, GH
ε ~ N(0, 1) | Type I errors | B, C, F, H, G^2, AF, AH, BE, CF, EF | B, C, F, H, B^2, AB, BC, BE, BG, CF
ε ~ N(0, 1) | Type II errors | E^2, H^2, AC, CG | NONE
ε ~ N(0, 2) | Identified | A, E, G, GH | A, E, G, E^2, AC, CG, EH, GH
ε ~ N(0, 2) | Type I errors | B, C, D, F, H, J, D^2, AB, AJ, BF, CE | B, C, D, H, J, G^2, AB, AH, BD, BG, BH
ε ~ N(0, 2) | Type II errors | E^2, H^2, AC, CG, EH | H^2
ε ~ N(0, 3) | Identified | A, E, G | A, E, G, AC, CG, EH, GH
ε ~ N(0, 3) | Type I errors | B, D, AD, BD, BE | C, H, CH
ε ~ N(0, 3) | Type II errors | E^2, H^2, AC, CG, EH, GH | E^2, H^2

Table 44. Second Order Screening Design Results: Case 1

Strong Heredity, Factor Sparsity Model: Rep 3
2A - 1.5E + 2G - 3A^2 + 2.5E^2 - 4G^2 + 4AE + 3.5AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, AE, AG, EG | A, E, G, G^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | B, C, D, F, H, J, J^2, AB, DH, EJ | C, D, F, H, J, AH, CF
ε ~ N(0, 1) | Type II errors | A^2, E^2, G^2 | A^2, E^2
ε ~ N(0, 2) | Identified | A, E, G, G^2, EG | A, E, G, G^2, AE, AG, EG
ε ~ N(0, 2) | Type I errors | B, D, F, H, B^2, AH, BF, EF, EH | B, C, D, F, AF, BC, CD, CF
ε ~ N(0, 2) | Type II errors | A^2, E^2, AE, AG | A^2, E^2
ε ~ N(0, 3) | Identified | A, E, G, A^2, G^2, EG | A, E, G
ε ~ N(0, 3) | Type I errors | B, C, D, F, H, J, BF, CJ, DF, EH | B, F, H, J, J^2, BJ, EH, EJ, FG
ε ~ N(0, 3) | Type II errors | E^2, AE, AG | A^2, E^2, G^2, AE, AG, EG

Table 45. Second Order Screening Design Results: Case 2

Strong Heredity, Effect Sparsity Model: Rep 3
2A - 1.5E + 2G + 4C - 3H + 2.5E^2 - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, C, G, H, E^2, CG, EH, GH | A, E, C, G, H, E^2, CG, EH, GH
ε ~ N(0, 1) | Type I errors | B, D, F, BH, CD, FG | B, D, F, J, G^2, AB, AJ, FG
ε ~ N(0, 1) | Type II errors | NONE | NONE
ε ~ N(0, 2) | Identified | A, E, C, G, H, CG, EH, GH | A, E, C, G, H, CG, EH, GH
ε ~ N(0, 2) | Type I errors | F, AF | B, D, F, G^2, AE, AF, AH, BF, DH
ε ~ N(0, 2) | Type II errors | E^2 | E^2
ε ~ N(0, 3) | Identified | A, E, C, G, H, E^2, CG, EH, GH | A, E, C, G, H, E^2, CG, EH, GH
ε ~ N(0, 3) | Type I errors | B, F, J, AB, CE, CF | A^2
ε ~ N(0, 3) | Type II errors | NONE | NONE

Table 46. Second Order Screening Design Results: Case 3

Weak Heredity, Factor Sparsity Model: Rep 3
2A + 2E - 1.5A^2 + 2.5E^2 - 3.5AE + 4AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, AE, AG, EG | A, E, E^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | D, F, G, H, J, D^2, AJ, DH | B, C, D, F, G, H, J, B^2, D^2, AJ, CD, DG, FH, GJ
ε ~ N(0, 1) | Type II errors | A^2, E^2 | A^2
ε ~ N(0, 2) | Identified | A, E, AE | A, E, AE, AG, EG
ε ~ N(0, 2) | Type I errors | B, D, F, G, J, G^2, AD, BE, BF, DJ, EF | B, C, D, F, G, B^2, AC, AD, AF, BC, BF, DF
ε ~ N(0, 2) | Type II errors | A^2, E^2, AG, EG | A^2, E^2
ε ~ N(0, 3) | Identified | A, E | A, E, E^2, AE, AG, EG
ε ~ N(0, 3) | Type I errors | B, C, D, F, G, H, J, B^2, F^2, J^2, AD, AJ, CE, DH | G, H, EH
ε ~ N(0, 3) | Type II errors | A^2, E^2, AE, AG, EG | A^2

Table 47. Second Order Screening Design Results: Case 4

Weak Heredity, Effect Sparsity Model: Rep 3
2A - 1.5E + 2G + 2.5E^2 - 3H^2 + 4AC - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, GH | A, E, G, AC, CG, EH, GH
ε ~ N(0, 1) | Type I errors | C, D, F, H, AD, AH, CE, DE, DG | B, C, D, F, H, J, G^2, AB, AG, AJ, DJ, FJ, GJ
ε ~ N(0, 1) | Type II errors | E^2, H^2, AC, CG, EH | E^2, H^2
ε ~ N(0, 2) | Identified | A, E, G | A, E, G, AC, CG, EH, GH
ε ~ N(0, 2) | Type I errors | B, C, D, F, D^2, AG, BD, BF, CE, CF, FG | C, H, J, J^2
ε ~ N(0, 2) | Type II errors | E^2, H^2, AC, CG, EH, GH | E^2, H^2
ε ~ N(0, 3) | Identified | A, E, G, AC, CG, EH, GH | A, E, G, AC, CG, EH, GH
ε ~ N(0, 3) | Type I errors | B, C, D, F, H, J, AF, CE, DJ | B, C, F, H, AB, AH, BF, CF, EF, FH
ε ~ N(0, 3) | Type II errors | E^2, H^2 | E^2, H^2

Table 48. Second Order Screening Design Results: Case 1

Strong Heredity, Factor Sparsity Model: Rep 4
2A - 1.5E + 2G - 3A^2 + 2.5E^2 - 4G^2 + 4AE + 3.5AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, E^2, AG, EG | A, E, G, E^2, G^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | B, C, D, F, H, J, AD, BF, BG, CH, EJ, FJ | B, C, D, F, H, J, F^2, H^2, AC, BC, CD, CG, DE, EH
ε ~ N(0, 1) | Type II errors | A^2, G^2, AE | A^2
ε ~ N(0, 2) | Identified | A, E, G, G^2, AE, EG | A, E, G, G^2, AE, AG, EG
ε ~ N(0, 2) | Type I errors | B, D, F, AB, BF, DF | B, F, H, BF, FH, GH
ε ~ N(0, 2) | Type II errors | A^2, E^2, AG | A^2, E^2
ε ~ N(0, 3) | Identified | A, E, G, G^2, AG | A, E, G, E^2, AG, EG
ε ~ N(0, 3) | Type I errors | B, F, BE, BF | J, AJ, GJ
ε ~ N(0, 3) | Type II errors | A^2, E^2, AE, AG | A^2, G^2, AE

Table 49. Second Order Screening Design Results: Case 2

Strong Heredity, Effect Sparsity Model: Rep 4
2A - 1.5E + 2G + 4C - 3H + 2.5E^2 - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, C, G, H, CG, EH, GH | A, E, C, G, H, CG, EH, GH
ε ~ N(0, 1) | Type I errors | B, D, F, J, A^2, J^2, BG, EG, FG | J, A^2, G^2, J^2, AC
ε ~ N(0, 1) | Type II errors | E^2 | E^2
ε ~ N(0, 2) | Identified | A, E, C, G, H, CG, EH, GH | A, E, C, G, H, CG, EH, GH
ε ~ N(0, 2) | Type I errors | B | NONE
ε ~ N(0, 2) | Type II errors | E^2 | E^2
ε ~ N(0, 3) | Identified | A, E, C, G, H, CG, EH, GH | A, E, C, G, H, CG, EH, GH
ε ~ N(0, 3) | Type I errors | F, AF, EF, FH | D, F, CD, FH
ε ~ N(0, 3) | Type II errors | E^2 | E^2

Table 50. Second Order Screening Design Results: Case 3

Weak Heredity, Factor Sparsity Model: Rep 4
2A + 2E - 1.5A^2 + 2.5E^2 - 3.5AE + 4AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, E^2, AE, AG, EG | A, E, A^2, E^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | B, C, D, F, H, J, J^2, AC, BD, BJ | B, C, D, F, G, H, J, B^2, C^2, G^2, J^2, BG, CE, DF, EH, EJ, HJ
ε ~ N(0, 1) | Type II errors | A^2 | NONE
ε ~ N(0, 2) | Identified | A, E, AE | A, E, AE, AG, EG
ε ~ N(0, 2) | Type I errors | B, F, H, J, B^2, J^2, AH, BF | G, H, GH
ε ~ N(0, 2) | Type II errors | A^2, E^2, AG, EG | A^2, E^2
ε ~ N(0, 3) | Identified | A, E, AE | A, E, AE, AG, EG
ε ~ N(0, 3) | Type I errors | B, F, BE, BF | F, G, J, J^2, FJ, GJ
ε ~ N(0, 3) | Type II errors | A^2, E^2, AG, EG | A^2, E^2

Table 51. Second Order Screening Design Results: Case 4

Weak Heredity, Effect Sparsity Model: Rep 4
2A - 1.5E + 2G + 2.5E^2 - 3H^2 + 4AC - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, GH | A, E, G, H^2, AC, CG, EH, GH
ε ~ N(0, 1) | Type I errors | B, C, D, F, H, J, B^2, AD, AH, CD, DE, DG, GJ | C, D, H, J, A^2, AD, AH, CE, CJ
ε ~ N(0, 1) | Type II errors | E^2, H^2, AC, CG, EH | E^2
ε ~ N(0, 2) | Identified | A, G, CG, GH | A, E, G, AC, CG, EH, GH
ε ~ N(0, 2) | Type I errors | B, C, H, J, A^2, AB, CJ | C, D, H, AD, CD
ε ~ N(0, 2) | Type II errors | E, E^2, H^2, AC, EH | E^2, H^2
ε ~ N(0, 3) | Identified | A, E, E^2 | A, E, G, H^2, AC, CG, EH, GH
ε ~ N(0, 3) | Type I errors | F, H, AF, AH, FH | C, D, F, H, CD, FH
ε ~ N(0, 3) | Type II errors | G, H^2, AC, CG, EH, GH | E^2

Table 52. Second Order Screening Design Results: Case 1

Strong Heredity, Factor Sparsity Model: Rep 5
2A - 1.5E + 2G - 3A^2 + 2.5E^2 - 4G^2 + 4AE + 3.5AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, AE, AG, EG | A, E, G, A^2, E^2, G^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | B, C, F, J, B^2, J^2, CE, EJ | B, C, D, F, H, J, C^2, D^2, AB, AC, BJ, DE, FJ, HJ
ε ~ N(0, 1) | Type II errors | A^2, E^2, G^2 | NONE
ε ~ N(0, 2) | Identified | A, E, G, A^2, EG | A, E, G, A^2, E^2, G^2, AE, AG, EG
ε ~ N(0, 2) | Type I errors | B, C, D, F, J, F^2, BF, CJ, DF | C, F, AF
ε ~ N(0, 2) | Type II errors | E^2, G^2, AE, AG | NONE
ε ~ N(0, 3) | Identified | A, E, G, EG | A, E, G, A^2, AE, AG, EG
ε ~ N(0, 3) | Type I errors | C, D, J, D^2, AJ, CE, CG, EJ | D, J, J^2, EJ
ε ~ N(0, 3) | Type II errors | A^2, E^2, G^2, AE, AG | E^2, G^2



Table 53. Second Order Screening Design Results: Case 2

Strong Heredity, Effect Sparsity Model: Rep 5
2A - 1.5E + 2G + 4C - 3H + 2.5E^2 - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, C, G, H, E^2, CG, EH, GH | A, E, C, G, H, E^2, CG, EH, GH
ε ~ N(0, 1) | Type I errors | B, D, F, J, BD, CH, FG | B, F, J, A^2, AF, BC, BG, EJ
ε ~ N(0, 1) | Type II errors | NONE | NONE
ε ~ N(0, 2) | Identified | A, E, C, G, H, E^2, CG, EH | A, E, C, G, H, E^2, CG, EH, GH
ε ~ N(0, 2) | Type I errors | F, AC, CE, EF | F, G^2, AF
ε ~ N(0, 2) | Type II errors | GH | NONE
ε ~ N(0, 3) | Identified | A, E, C, G, H, CG, EH, GH | A, E, C, G, H, CG, EH, GH
ε ~ N(0, 3) | Type I errors | J, J^2, AJ | B, D, J, J^2, AE, BG, CD
ε ~ N(0, 3) | Type II errors | E^2 | E^2

Table 54. Second Order Screening Design Results: Case 3

Weak Heredity, Factor Sparsity Model: Rep 5
2A + 2E - 1.5A^2 + 2.5E^2 - 3.5AE + 4AG - 5EG + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, A^2, AE, EG | A, E, A^2, E^2, AE, AG, EG
ε ~ N(0, 1) | Type I errors | B, C, D, F, G, J, B^2, J^2, BD, BF, CE | B, C, F, G, J, BJ, CF
ε ~ N(0, 1) | Type II errors | E^2, AG | NONE
ε ~ N(0, 2) | Identified | A, E, AE, AG, EG | A, E, E^2, AE, AG, EG
ε ~ N(0, 2) | Type I errors | C, F, G, J, F^2, AC, AJ, CF | C, F, G, H, AF, CF, GH
ε ~ N(0, 2) | Type II errors | A^2, E^2 | A^2
ε ~ N(0, 3) | Identified | A, E, AE | A, E, A^2, AE, AG, EG
ε ~ N(0, 3) | Type I errors | B, C, F, G, J, G^2, BF, EJ, GJ | B, C, G, J, J^2, AB, BC, BG, EJ
ε ~ N(0, 3) | Type II errors | A^2, E^2, AG, EG | E^2

Table 55. Second Order Screening Design Results: Case 4

Weak Heredity, Effect Sparsity Model: Rep 5
2A - 1.5E + 2G + 2.5E^2 - 3H^2 + 4AC - 5CG + 3.5EH - 4GH + ε

Scenario | Result | DSD | DSD+
ε ~ N(0, 1) | Identified | A, E, G, GH | A, E, G, E^2, H^2, AC, CG, EH, GH
ε ~ N(0, 1) | Type I errors | B, C, F, H, J, B^2, G^2, AE, AF, AH, EF | B, C, D, F, H, J, D^2, AF, BE, CE, DF, DH, FJ
ε ~ N(0, 1) | Type II errors | E^2, H^2, AC, CG, EH | NONE
ε ~ N(0, 2) | Identified | A, E, G, AC, CG | A, E, G, E^2, H^2, AC, CG, EH, GH
ε ~ N(0, 2) | Type I errors | C, F, C^2, G^2, AF, CE, FG | C, F, H, A^2, G^2, AF, CE
ε ~ N(0, 2) | Type II errors | E^2, H^2, EH, GH | NONE
ε ~ N(0, 3) | Identified | A, E, G | A, E, G, AC, CG, EH, GH
ε ~ N(0, 3) | Type I errors | B, C, D, F, J, G^2, AB, BD, BF, CE | C, H, J, J^2, EG
ε ~ N(0, 3) | Type II errors | E^2, H^2, AC, CG, EH, GH | E^2, H^2

V. Nonlinear Screening Designs for Defense Testing: An Overview and Case Study


5.1  Introduction 

“Necessity  is  the  Mother  of  Invention.”  Plato  is  often  credited  with  authoring 
this  quote,  but  whether  he  is  the  true  author  or  not  remains  unknown.  However,  not 
knowing  the  author  does  not  diminish  the  meaning  and  impact  this  simple  quote  has 
with  regards  to  the  situation  the  Department  of  Defense  (DOD)  and  in  particular 
the  Defense  Acquisition  Test  and  Evaluation  community  find  themselves  in  today. 
Available  resources,  whether  they  be  personnel,  budgets,  or  facilities,  are  continuing 
to  shrink.  Meanwhile,  acquisition  efforts  are  reducing  timelines,  even  though  systems 
are  becoming  increasingly  more  complex.  As  a  result,  testing  methodologies  which 
optimize  the  employment  of  resources  are  gaining  emphasis  and  acceptance. 

In a 2010 memorandum, Dr. Gilmore provided key policy guidance on the use of
Design of Experiments (DOE) in OT&E. Furthermore, the DOT&E Scientific Advisor
(SA), Dr. Catherine Warner, highlighted that while DOE is a structured,
rigorous statistical tool for test planning and analysis that has been written about
extensively in the academic setting, many questions remain regarding how
to apply DOE to T&E within the DOD (Warner, 2011).

In April 2012, Dr. Steven Hutchison, Principal Deputy, Office of the Deputy
Assistant Secretary of Defense for Developmental Test and Evaluation (DASD(DT&E)),
stated, “By applying scientific methods to the test design, we can not only achieve
great efficiencies, but we can significantly improve confidence in our results. The
Scientific Test and Analysis Techniques in Test & Evaluation Center of Excellence
(STAT T&E COE) will provide a critical venue for enhancing the test design for DOD
acquisition programs.”

Finally, in July 2013, Dr. Gilmore published a best practices memorandum for
the statistical adequacy of operational test and evaluation (Gilmore, 2013). Included
as an attachment to the memo was a white paper of best practices. One of the
four identified test objectives was to screen for important factors that affect system
performance. Although screening is an important test objective, the potentially
useful experimental designs listed in the white paper focused only on linear effects.

This paper presents the use of design of experiments and response surface designs
in the area of second-order screening designs, particularly as applied to defense
testing. Extensions to existing designs are examined with respect to improvements in
robustness and applicability to defense testing. A wind tunnel case study demonstrates
a viable use of a second-order screening design.

5.2  Background 

Military  systems,  particularly  aerodynamic  systems,  are  complex.  It  is  not  unusual 
for  these  systems  to  exhibit  nonlinear  behavior.  Developmental  testing  may  be  tasked 
to  characterize  the  nonlinear  behavior  of  such  systems  while  being  asked  to  reduce 
the  amount  of  testing  accomplished. 

The one-factor-at-a-time (OFAT) experimentation strategy consists of successively
varying each factor independently over its range while holding all the remaining
factors at baseline settings. The OFAT method saturates the experimental design space
and provides the capacity to determine how each factor affects the response variable
while all other factors are held constant. An overwhelming disadvantage of the OFAT
strategy is that it does not consider any possible interaction between factors.
Additionally, as Hill et al. (2011) point out, the OFAT strategy is not cost effective,
nor does it control experimental uncertainty or produce minimum variance predictions.
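
The interaction blind spot can be checked directly. The sketch below is an illustration under assumed coded levels (-1 for the baseline setting, +1 for the changed setting) and is not taken from Hill et al. (2011). For a two-factor OFAT plan, the AB interaction column is an exact linear combination of the intercept and the two main-effect columns, so the interaction cannot be estimated separately; the full factorial breaks that dependence.

```python
from itertools import product


def ofat_runs(n_factors):
    """Baseline run plus one run per factor with only that factor moved to +1."""
    baseline = tuple([-1] * n_factors)
    runs = [baseline]
    for i in range(n_factors):
        run = list(baseline)
        run[i] = 1
        runs.append(tuple(run))
    return runs


def interaction_is_dependent(runs):
    """True if a*b == -1 - a - b on every run, so the AB column adds no new information."""
    return all(a * b == -1 - a - b for a, b in runs)


ofat = ofat_runs(2)                            # baseline, A changed, B changed
factorial = list(product((-1, 1), repeat=2))   # all four corner points

print(interaction_is_dependent(ofat))       # AB is confounded under OFAT
print(interaction_is_dependent(factorial))  # the (+1, +1) run breaks the dependence
```

The single extra corner run is what separates the interaction from the main effects, which is the information an OFAT plan never collects.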

Despite the disadvantages and deficiencies of an OFAT experimentation
strategy, conventional wind tunnel tests, which are a critical component of the
Developmental Test and Evaluation (DT&E) of aeronautical systems, are usually
conducted in a manner consistent with the OFAT methodology. Hill et al. (2011)
point out that current high-performance military aircraft development programs require
up to 3700 wind tunnel test hours in the conceptual design phase and up to 18,500
hours in the development/validation phase. Such requirements become difficult to
sustain at a time when the United States Air Force (USAF) and DOD are seeking
reductions in developmental schedules and budgetary requirements.

DOE, or experimental design, is a statistical technique used to organize an
experimental test or series of tests in a manner such that observed changes in an output
response can be attributed to systematic changes made to the input variables of a
process or system (Montgomery, 2013). While the designs are based upon statistical
techniques, the actual design forms vary greatly depending upon the form of the
empirical model used to represent the process or system response. Typically, first-order
polynomial models are used extensively in screening experiments while second-order
polynomial models are commonly used in modeling and optimization experiments.

When a system or process is new, screening designs are usually performed to
determine which of the many factors (if any) have a significant effect on the system
or process response. Screening designs usually assume a linear (main effects or main
effects plus interaction) response so as not to waste valuable resources when
experimenters do not know much about the system or process being studied. The
assumption that the response is approximately linear is reasonable for many factor
screening experiments when a system or process is just starting to be studied. However,
there are times when subject matter expertise or historical data indicate a second-order
polynomial response is more reasonable.

Traditionally, when a response is suspected of exhibiting second-order behavior,
experiments are conducted sequentially. First, screening for important factors is
conducted assuming a linear response in main effects, and then follow-on experiments
focus on second-order models using the important factors identified in the
screening experiment. However, there are times when conducting multiple experiments
sequentially is unrealistic due to time, budget, or other constraints. For instance,
within the agricultural field the duration of an experiment can be exceedingly long,
and within a manufacturing setting experimental preparation can be overly
time-consuming. Directly applicable to the DOD, Lawson (2003) points out that fixed
deadlines for scale-up and production of prototype engineering designs may not allow
the possibility of follow-up experimentation.

In these instances, it would be better, if not necessary, to perform factor screening
and response surface exploration in the same experiment rather than conducting
experiments sequentially. This has significant implications for experimental screening
designs. Second-order screening designs are extremely important when working with new
or existing systems and technologies believed to exhibit nonlinear responses, so that
valuable resources are not wasted on best-guess and one-factor-at-a-time (OFAT)
approaches. In the following sections, we provide an overview of screening designs
and illustrate the use of a single augmented Definitive Screening Design (DSD+)
for determining significant factors associated with transonic and supersonic subspace
wind tunnel testing data.
5.3  Screening  Designs  Overview


Many experiments start by considering many factors, which in turn increases the
overall size and cost of the experiment. Since in reality no two experiments are exactly
the same, a multitude of screening designs are available for use depending upon any
number of variables. For the novice practitioner, the most obvious variables are the
number of factors being considered and the number of available experimental runs.
However, variables such as the range over which factors will be varied and the number
of distinct levels at which runs will be made are equally important. More important
still, issues such as whether follow-up experiments will be available, the
shape of the design region, the ease with which the data can be analyzed statistically,
and subject matter expertise all affect screening design selection. Lastly, design
selection depends greatly upon the form of the empirical model used to represent the
process or system response.

Traditionally, research on screening designs has involved concepts such as design
resolution, minimum aberration, power, and the number of clear (non-confounded)
effects, along with rotatability, alphabetic optimality, and prediction variance.

Resolution, generally denoted in Roman numerals, measures the degree of complete
confounding among main effects and interactions in a fractional factorial design.
The confounding characteristics of the common design resolutions are:

• Res III: Main effects are clear of other main effects, but at least one main
effect is confounded with at least one two-way interaction.

• Res IV: Main effects are clear of two-way interactions, but at least one two-way
interaction is confounded with at least one other two-way interaction.

• Res V: Main effects and two-way interactions are clear of any other main effect
or two-way interaction, but at least one two-way interaction is confounded with
at least one three-way interaction.

There are times, however, when different designs possess the same resolution and
fractionation but have different confounding (aliasing) structures. Fries and
Hunter (1980) proposed the concept of design aberration for regular two-level
designs as a means to differentiate between such designs. Since Fries and Hunter's
initial work, the minimum aberration criterion has been extended to two-level
nonregular, multilevel, and mixed-level fractional factorial designs (Guo et al.,
2009).

Optimal designs are typically assessed against specific criteria, such as good
estimation of model parameters or good prediction capacity within the design
region. Alphabetic optimality refers to the family of design optimality criteria
that are each characterized by a letter of the alphabet, currently A-, D-, G-, V-,
or I-. The chosen criterion determines what constitutes an optimal design, so each
“optimal” design is focused on one particular design characteristic. Two of the
most popular criteria are D- and I-optimality: D-optimal designs focus on good
model parameter estimates, whereas I-optimal designs focus on good prediction
capacity within the design region by minimizing the average scaled prediction
variance. Since I-criteria are prediction-oriented and D-criteria are
parameter-oriented, they are mostly used for second-order and first-order designs,
respectively. For more on alphabetic optimality, see Chapter 8 of Myers and
Anderson-Cook (2009).
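As a rough illustration of the two criteria, the following Python sketch computes a
D-efficiency and an average scaled prediction variance (the quantity an I-optimal
design minimizes) for a first-order model. NumPy, the function names, and the small
2^2 example design are assumptions made for this sketch and do not come from the
cited references; the closed-form region moments used here hold only for a
first-order model on the cube [-1, 1]^k.

```python
import numpy as np

def model_matrix_first_order(design):
    """Prepend an intercept column to a coded design matrix."""
    design = np.asarray(design, dtype=float)
    return np.column_stack([np.ones(len(design)), design])

def d_efficiency(X):
    """D-efficiency |X'X|^(1/p) / N for a model matrix X in coded units."""
    n, p = X.shape
    return np.linalg.det(X.T @ X) ** (1.0 / p) / n

def i_criterion_first_order(X):
    """Average scaled prediction variance over the cube [-1, 1]^k.

    For a first-order model the region moment matrix is diag(1, 1/3, ..., 1/3),
    so the average has the closed form N * trace((X'X)^-1 M).
    """
    n, p = X.shape
    moments = np.diag([1.0] + [1.0 / 3.0] * (p - 1))
    return n * np.trace(np.linalg.inv(X.T @ X) @ moments)

# Hypothetical example: a 2^2 full factorial in coded units.
design = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]])
X = model_matrix_first_order(design)
d_eff = d_efficiency(X)
i_val = i_criterion_first_order(X)
print(f"D-efficiency = {d_eff:.3f}, average scaled prediction variance = {i_val:.3f}")
```

For a second-order model the same idea applies, but the moment matrix must include
the higher-order region moments, which this sketch does not attempt.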

Screening designs usually assume a linear response (main effects only, or main
effects plus interactions) so that factors can be studied at two levels, thereby
conserving experimental resources. Popular regular and nonregular designs used in
screening experiments are full and fractional two-level factorial designs,
Plackett-Burman designs, and supersaturated designs. Regular designs are
constructed through defining relations among their factors, whereas nonregular
designs lack such a defining relation.

The $2^k$ Factorial Design consists of k factors, each at only two levels, and is a
special case of the full factorial design with $2^k$ observations per replication.
$2^k$ designs have many useful properties. In addition to being orthogonal, $2^k$
designs are I-optimal for fitting a first-order model or a first-order model with
interactions (Montgomery, 2013). The $2^k$ designs are widely used for factor
screening because they provide the smallest number of runs for independently
estimating all main effects and interactions for k factors.

The $2^{k-p}$ Fractional Factorial Design uses a subset of the runs of the $2^k$
Factorial Design. Like the $2^k$ Factorial Design, the $2^{k-p}$ Fractional
Factorial Design consists of k factors, each at only two levels. The value of p
specifies the degree to which the design is fractionated, with the design using a
$1/2^p$ fraction of the full factorial runs. In the $2^{k-p}$ design, the first
k - p columns are generated by the runs of a full factorial design in k - p
factors. The remaining p columns are generated as interactions of the first k - p
columns (Wu and Hamada, 2011). Because these design generators are column
interactions, the effect estimates for the p generated factors are aliased, meaning
their effects on the system response cannot be estimated separately from factor
interactions. The degree to which the effects are aliased is given by the design
resolution.
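The construction described above can be sketched in a few lines of Python. The
example builds a $2^{4-1}$ design from a full factorial in three factors; the
generator D = ABC and the use of NumPy are illustrative choices for this sketch,
not something prescribed by the cited references.

```python
import itertools
import numpy as np

# Base design: full 2^3 factorial in factors A, B, C (coded -1/+1).
base = np.array(list(itertools.product([-1, 1], repeat=3)))
A, B, C = base.T

# Generator (illustrative choice): D = ABC, defining relation I = ABCD.
D = A * B * C
design = np.column_stack([A, B, C, D])   # 2^(4-1) design, 8 runs

# Main-effect columns remain mutually orthogonal.
gram = design.T @ design

# Two-factor interactions are aliased in pairs, e.g. AB = CD.
ab_equals_cd = np.array_equal(A * B, C * D)
print(design.shape, ab_equals_cd)
```

Because the shortest word in the defining relation has four letters, this
particular fraction is resolution IV: main effects are clear of two-factor
interactions, while the two-factor interactions are aliased with one another.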

Plackett and Burman (1946) developed nonregular two-level fractional factorial
designs which can study k = N - 1 variables in N runs, where N is a multiple of 4.
If $N = 2^i$ for i > 2, PB designs are equivalent to regular $2^{k-p}$ fractional
factorial designs. The nonregular Plackett-Burman designs sacrifice a simple alias
structure for better run economy and projectivity when compared to regular
$2^{k-p}$ designs. As a result of their complex alias structures, the analysis of
PB designs can become complex. Hamada and Wu (1992) discuss methods for analyzing
designs with complex aliasing based upon the effect sparsity and effect heredity
principles.
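To make the run economy and the partial aliasing concrete, the sketch below builds
the 12-run Plackett-Burman design from its standard cyclic generating row and checks
two properties. NumPy and the variable names are assumptions of this example.

```python
import numpy as np

# Generating row for the 12-run Plackett-Burman design (+ + - + + + - - - + -).
first_row = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])

# Eleven cyclic shifts of the generating row, then a final row of all -1.
rows = [np.roll(first_row, shift) for shift in range(11)]
rows.append(-np.ones(11, dtype=int))
pb12 = np.array(rows)                     # 12 runs x 11 two-level factors

# Main-effect columns are balanced and mutually orthogonal.
gram = pb12.T @ pb12

# Complex aliasing: a main effect is partially correlated with a two-factor
# interaction of two other factors (neither orthogonal nor fully confounded).
partial = int(pb12[:, 0] @ (pb12[:, 1] * pb12[:, 2]))
print(pb12.shape, np.array_equal(gram, 12 * np.eye(11)), partial / 12)
```

The nonzero but fractional correlation in the last line is what distinguishes this
nonregular design from a regular fraction, where such correlations are either 0 or
plus or minus 1.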

Supersaturated designs (SSD) are nonregular fractional factorial designs in which
the number of factors k under investigation exceeds the number of available
experimental runs N. Since k > N - 1, the degrees of freedom within the design are
insufficient to estimate all the main effects, and the design matrix cannot be
orthogonal. Therefore, for supersaturated designs to be useful as screening
designs, only a few factors can be active. As such, supersaturated designs are
generally used when the number of potential factors is large but few are believed
to have actual effects (effect sparsity) and either budget or time constraints
limit the number of experimental runs. Some care must be exercised in the selection
of an SSD. Since an SSD cannot attain orthogonality, it could produce misleading
results if the design departs considerably from an orthogonal design. The $E(s^2)$
criterion gives an intuitive measure of nonorthogonality: the smaller, the better.
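A minimal sketch of the $E(s^2)$ calculation follows, taking $E(s^2)$ as the average
of the squared off-diagonal entries of X'X over all pairs of factor columns. The
function name and both small example designs are hypothetical and chosen only so the
result can be checked by hand.

```python
from itertools import combinations
import numpy as np

def e_s2(design):
    """E(s^2): mean of squared off-diagonal entries of X'X over all column pairs."""
    X = np.asarray(design, dtype=float)
    gram = X.T @ X
    pairs = list(combinations(range(X.shape[1]), 2))
    return sum(gram[i, j] ** 2 for i, j in pairs) / len(pairs)

# Orthogonal 4-run, 3-factor half fraction: E(s^2) is zero.
orthogonal = np.array([[ 1,  1,  1],
                       [ 1, -1, -1],
                       [-1,  1, -1],
                       [-1, -1,  1]])

# Hypothetical 4-run, 4-factor design (k > N - 1): the fourth column repeats
# the first, so one column pair is fully correlated.
supersaturated = np.column_stack([orthogonal, orthogonal[:, 0]])

print(e_s2(orthogonal), e_s2(supersaturated))
```

In the second design only one of the six column pairs is non-orthogonal, with an
inner product of 4, so the criterion evaluates to 16/6.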

When the response is believed to possess significant curvature, each factor needs
at least three levels. If follow-on experiments are available, two-level regular
designs can be augmented with follow-on design runs to accommodate curvature.
However, there are designs which are robust to the linear effect assumption, such
as the $3^k$ full or $3^{k-p}$ fractional factorial designs, the Central Composite
Design (CCD), the Box-Behnken Design (BBD), and the saturated or near-saturated
Hoke, Hybrid, and Small Composite Designs (SCD).

The $3^k$ Factorial Design, which consists of k factors each at three levels, is a
special case of the full factorial design with $3^k$ observations per replication.
The addition of a third factor level over the $2^k$ design allows the response to
be modeled as a quadratic function.

The $3^{k-p}$ Fractional Factorial Design consists of k factors each at three
levels. The value of p again specifies the degree to which the design is
fractionated, with the design using a $1/3^p$ fraction of the full factorial runs.
A general procedure for constructing a $3^{k-p}$ fractional factorial design is
given by Montgomery (2013). Connor and Zelen (1959) and Xu (2005) provide extensive
lists of $3^{k-p}$ designs. Unfortunately, compared to $2^{k-p}$ designs, the
aliasing structure of $3^{k-p}$ designs is very complex, especially as the level of
fractionation increases. If interactions are not negligible, design results can be
difficult if not nearly impossible to interpret because of the partial aliasing of
two-degree-of-freedom components (Montgomery, 2013).

Box and Wilson (1951) introduced an alternative class of designs to the $3^k$
factorial designs. The Central Composite Designs (CCD) contain a $2^k$ or $2^{k-p}$
design, axial (star) runs, and center runs, which are set at the middle of the
factor range. The axial runs are selected so as to maintain a rotatable or
near-rotatable design, so that the variance of the predicted response is constant
at all points equidistant from the design center (Montgomery, 2013). As such, the
CCD typically involves k factors at five levels per factor. CCDs are popular
because of the sequential nature in which they can be implemented.
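The three components of a CCD can be assembled as in the following sketch. It uses
the usual rotatable axial distance, the fourth root of the number of factorial
runs; the function name, the default of three center runs, and the use of NumPy are
assumptions of this example.

```python
import itertools
import numpy as np

def central_composite(k, n_center=3):
    """Rotatable CCD in coded units: 2^k factorial + 2k axial + center runs."""
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    alpha = len(factorial) ** 0.25            # rotatable axial distance F^(1/4)
    axial = np.vstack([alpha * np.eye(k), -alpha * np.eye(k)])
    center = np.zeros((n_center, k))
    return np.vstack([factorial, axial, center]), alpha

design, alpha = central_composite(k=3)
levels = sorted(set(np.round(design[:, 0], 6)))
print(design.shape, round(alpha, 3), levels)
```

Each factor takes the five levels -alpha, -1, 0, 1, and alpha, which is where the
"five levels per factor" in the text comes from. The factorial block can be run
first and the axial and center runs added later, which is the sequential use the
text refers to.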

Box and Behnken (1960) developed a family of efficient rotatable or near-rotatable
spherical three-level designs suitable for fitting second-order (quadratic)
response models. In contrast to the CCD, the Box-Behnken design does not contain
any points at the vertices or face centers of the design region but rather at the
centers of the edges of the process space. As a result, Box-Behnken designs avoid
extreme factor-level combinations, which may be impossible to test due to cost or
physical process constraints (Montgomery, 2013). The BBD are formed by varying p
parameters in a full factorial manner while the remaining k - p parameters are held
at the center factor level setting. Additionally, the BBD uses three to five center
runs to avoid singularity in the design matrix and to maintain favorable design
qualities (Myers and Anderson-Cook, 2009). Overall, the design run requirements for
the BBD and CCD are comparable. As a result, the benefit of employing a BBD over a
CCD is not necessarily run efficiency but rather the location of the factor-level
combinations in the design space.
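For small k, the construction just described amounts to running a $2^2$ factorial
on every pair of factors (p = 2) while holding the others at the center. The sketch
below implements only that pairwise case; the function name and the default of
three center runs are assumptions, and designs for larger k that vary more than two
factors at a time are not covered.

```python
import itertools
import numpy as np

def box_behnken_pairs(k, n_center=3):
    """BBD sketch: a 2^2 factorial on every factor pair, other factors at 0."""
    runs = []
    for i, j in itertools.combinations(range(k), 2):
        for a, b in itertools.product([-1, 1], repeat=2):
            row = [0] * k
            row[i], row[j] = a, b
            runs.append(row)
    runs.extend([[0] * k] * n_center)        # center runs
    return np.array(runs)

bbd = box_behnken_pairs(k=3)
has_vertex = bool(np.any(np.all(np.abs(bbd) == 1, axis=1)))
print(bbd.shape, has_vertex)
```

For k = 3 this gives 12 edge-midpoint runs plus the center runs, and no run sits at
a vertex of the cube, which is the property the text highlights.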

Through the years, some of the original BBD have been improved upon in terms of
rotatability, average prediction variance, and D- and G-efficiency (Nguyen and
Borkowski, 2008). In addition, new Box-Behnken type designs with larger k (Mee,
2000) and with orthogonal blocking schemes different from those of the original BBD
(Nguyen and Borkowski, 2008) have been proposed. Most recently, small Box-Behnken
Designs (SBBD) have been proposed, which reduce the run size requirement of the
original BBD by replacing the full $2^k$ factorial designs partly by two-level
fractional factorial designs and partly by full factorial designs (Zhang et al.,
2011). When compared to the original BBD, the SBBD possess smaller D-efficiency
values, but the values are still relatively high (> 70%) for k < 11 while requiring
fewer runs.

While reduced-run designs like the CCD and the BBD are more efficient than the
$2^k$ and $3^k$ designs, in which the full model is estimable, they can still
possess far more design points than needed to estimate the second-order response
effects. As a result, the class of saturated or near-saturated designs has been
developed. Saturated or near-saturated designs are designs in which the number of
design points is equal to or near, but not less than, the number of terms in the
design model.
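The gap between the minimum number of runs and the size of the classical designs
can be tabulated with a short sketch. The count of second-order model terms is
standard; the helper names and the single center run assumed for the CCD are
choices made for this illustration.

```python
from math import comb

def second_order_terms(k):
    """Terms in a full second-order model: intercept, k linear,
    k pure quadratic, and k-choose-2 two-factor interaction terms."""
    return 1 + 2 * k + comb(k, 2)

def ccd_runs(k, n_center=1):
    """Runs in a CCD built on a full 2^k factorial (n_center is an assumption)."""
    return 2 ** k + 2 * k + n_center

# Columns: k, model terms (saturated run size), CCD runs, 3^k runs.
for k in range(3, 7):
    print(k, second_order_terms(k), ccd_runs(k), 3 ** k)
```

A saturated second-order design has exactly as many runs as the second column, so
the difference between that column and the last two shows how many runs the
classical designs spend beyond the minimum.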

Hoke (1974) presented a class of second-order designs for k = 3 to 6 factors at
three levels based on saturated and near-saturated irregular fractions of the $3^k$
factorial. For each number of factors k, several versions of the Hoke designs
exist, consisting of a mixture of factorial, axial, and edge points, making the
Hoke designs suitable for a cuboidal region of interest (Myers and Anderson-Cook,
2009).

Roquemore (1976) presented a set of saturated or near-saturated second-order
designs, known as hybrid designs, for k = 3 to 6 factors which are rotatable or
near-rotatable while achieving the same degree of orthogonality as a CCD. The
hybrid design for k variables is constructed by first augmenting a
(k - 1)-variable central composite design with an additional column for variable
k. The design is then augmented with additional runs for variable k at different
levels to create desirable design properties.

In contrast to the CCD, which contains a $2^k$ or $2^{k-p}$ factorial design,
Hartley (1959) suggested replacing the factorial design with a special resolution
III factorial design in which two-factor interactions are not aliased with other
two-factor interactions. As a result, the number of design runs is decreased,
resulting in Small Composite Designs (SCD). The SCD sacrifices good prediction
variance properties for the reduction in run size because main effects may be
aliased with two-factor interactions. However, the SCD still allows for the
estimation of all main effects because the star portion of the design provides
additional information.

Unfortunately, while the $3^k$ full or $3^{k-p}$ fractional factorial designs, the
Central Composite Design (CCD), the Box-Behnken Design (BBD), and the saturated or
near-saturated Hoke, Hybrid, and Small Composite Designs (SCD) are robust to the
linear effect assumption, they are not very run-size efficient as screening designs
because they are built to accommodate curvature for all factors under
consideration.

As a result, recent literature has proposed employing a single experimental design
capable of performing both factor screening and response surface exploration when
conducting multiple experiments is unrealistic due to time, budget, or other
constraints. Initial attempts to perform both tasks with a single design relied
upon the design's projection capacity.

Cheng and Wu (2001), hereafter referred to as CW, introduced a two-stage analysis
method in which the key linkage between stages was the ability to project the
initial larger factor space onto a smaller factor space capable of fitting a
second-order model. Because a design can project onto many different combinations
of factors, a projection-efficiency criterion was developed to compare orthogonal
designs based upon (1) the number of eligible projected designs (designs which can
fit a second-order model) and (2) the estimation efficiency of the eligible
projected designs, determined by the ratio of each design's D- and G-efficiencies
(Cheng and Wu, 2001).

CW studied three orthogonal array (OA) designs which demonstrated desirable
projection properties. In contrast to $3^{k-p}$ designs, which have defining
contrast subgroups to describe the design structure, the designs studied by CW
required a computer search to classify the possible projected designs. Although
more complex, these designs have better overall projection properties and generally
require fewer runs. When compared to CCDs, the designs studied exhibited good
D-efficiencies but poor G-efficiencies as the number of projected factors
increases.

Improving on the designs of CW, Xu et al. (2004), hereafter referred to as XCW,
proposed a combinatorial method for constructing new and efficient OA designs and a
design selection approach based upon a projection aberration criterion (Xu and Wu,
2001) for factor screening and the projection-efficiency criteria (Cheng and Wu,
2001) for interaction detection. XCW's three-step approach involves: (1) screening
out orthogonal arrays that are poor for factor screening using the generalized
word-length pattern, (2) applying the projection aberration criterion to select a
best design from step 1, and (3) determining the best level permutations of the
design from step 2 to improve design projection eligibility and estimation
efficiency under the second-order model.

Ye et al. (2007), hereafter referred to as YTL, also examined three-level 18-run
and 27-run orthogonal designs; however, in addition to considering the projection
properties of designs, their design choices were based on both model estimation and
model discrimination criteria.

While previous work focused primarily on the design's projection capacity, Edwards
and Truong (2011) applied the Jones and Nachtsheim (2011b) method for finding
efficient designs, termed MA designs, with minimal aliasing between main effects
and two-factor interactions. Edwards and Truong (2011) compared the 27-run
orthogonal arrays of XCW and YTL with MA designs in terms of D-efficiency of
projection and, via a simulation study, the proportion of active factors declared
significant (Power 1) and the proportion of simulations in which only the true
active factors are declared significant (Power 2). Although ranked last in terms of
D-efficiency, the MA designs showed superior performance in their ability to detect
active factors (Edwards and Truong, 2011).

A common thread connecting the CW, XCW, YTL, and MA designs is the use of a linear
and quadratic main-effects-only analysis for factor screening. Unfortunately, if
the strong effect heredity principle fails to hold, important interactions can be
missed, leading to a misspecified response surface model. For cases where a
factor's significance may be present only in interactions with other factors, the
authors proposed the Bayesian approaches of either Box and Meyer (1993) or Chipman
et al. (1997) to account for significant factors outside of the main effects (Cheng
and Wu, 2001). Unfortunately, these methods are computationally intensive and are
not readily available to practitioners in statistical software packages, likely
making their use impractical (Edwards and Truong, 2011). Another area of concern
for the CW, XCW, YTL, and MA designs is that the projection onto the factors whose
main and/or quadratic effects are deemed significant during the first-stage
analysis does not always yield a second-order design.

Edwards and Mee (2011) introduced new spherical Fractional Box-Behnken designs
(FBBD) aimed at overcoming the projection deficiencies and the
main/quadratic-effect-only analysis issues found in the CW/XCW/YTL/MA designs. The
FBBD provide the ability to explore interactions during the screening stage and to
fit second-order models, via a backward elimination analysis strategy, to each of
the (k - 1)-factor projections.

The FBBDs are developed by taking subsets of the two-level fractional factorial
designs which compose a BBD (Edwards and Mee, 2011). The number of runs associated
with the FBBD varies depending upon the number of factors involved. While FBBDs
require more runs than CW/XCW/YTL/MA designs, their ease of construction and
aliasing structure facilitate an analysis strategy which cannot be applied to the
CW/XCW/YTL/MA designs. Additionally, as k increases, the FBBD require fewer runs
than the CCD or BBD.

Jones and Nachtsheim (2011a) introduced a class of three-level designs referred to
as “definitive screening designs” in which main effects are not biased by
second-order effects and all quadratic effects are estimable. Consisting of 2k + 1
runs for k factors, these designs were constructed using the same Jones and
Nachtsheim (2011b) method used by Edwards and Truong (2011).

Dougherty et al. (2013a) describe a computer-generated D-optimal design
augmentation technique which uses a k-factor Definitive Screening Design (DSD) as a
fixed baseline design and augments the design with k - 1 additional runs. The
resulting DSD+ focuses on improving the robustness of the DSD to the assumptions of
heredity and sparsity and on improving the identification of significant
second-order effects.

5.4 Case Study


Arnold Engineering Development Center (AEDC) provided Hill et al. (2011) the legacy
wind tunnel test data set for a 21% scale model of a system used to simulate a
supersonic, expendable, low-altitude, anti-ship missile. The data set consisted of
approximately 9000 design points within both the transonic and supersonic subspace
test regions. Six design variables (Angle of Attack, Roll Angle, Elevator
Deflection, Aileron Deflection, Rudder Deflection, and Mach Number) were varied
using an OFAT testing methodology to characterize the overall aerodynamic
performance of the missile in both test regions.

Hill  et  al.  (2011)  used  the  9000+  design  points  and  multiple  linear  regression  to 
develop  “ground  truth”  response  surface  models  of  the  missile  system  for  the  two 
partitioned  design  regions.  Equations  5.1  and  5.2  are  the  fitted  regression  models 
representing  this  “ground  truth”  for  the  transonic  and  supersonic  design  regions, 
respectively. 


\begin{align}
Y_{T(GT)} ={}& 0.7276645 - 0.008916X_1 + 0.0052574X_2 - 0.020997X_3 - 0.010612X_4 \nonumber\\
& - 0.035216X_5 + 0.6167071X_6 + 0.0104301X_1X_2 + 0.0877043X_1X_3 \nonumber\\
& - 0.011519X_1X_4 - 0.018356X_1X_5 + 0.0176079X_1X_6 - 0.017622X_2X_4 \nonumber\\
& + 0.0199821X_3X_4 - 0.137442X_3X_5 - 0.007419X_3X_6 + 0.0036692X_1^2 \nonumber\\
& + 0.1299117X_3^2 + 0.027947X_4^2 + 0.2434671X_5^2 - 0.05499X_6^2 \nonumber\\
& - 0.370656X_6^3 \tag{5.1}
\end{align}

\begin{align}
Y_{S(GT)} ={}& -0.126954 + 0.0591609X_1 + 0.006896X_2 - 0.012877X_3 - 0.006569X_4 \nonumber\\
& - 0.015172X_5 - 0.075103X_6 + 0.0065374X_1X_2 + 0.0561804X_1X_3 \nonumber\\
& - 0.00621X_1X_4 - 0.012867X_1X_5 + 0.0200284X_1X_6 - 0.008101X_2X_4 \nonumber\\
& + 0.0159264X_3X_4 - 0.01437X_3X_5 - 0.0037195X_3X_6 + 0.009108X_1^2 \nonumber\\
& + 0.0779585X_3^2 + 0.0174788X_4^2 + 0.1448347X_5^2 - 0.004695X_6^2 \nonumber\\
& - 0.037185X_6^3 \tag{5.2}
\end{align}

The  partitioning  of  the  data  into  transonic  and  supersonic  design  regions  enabled  the 
fitting  of  low-order  polynomial  models.  As  such,  Hill  et  al.  (2011)  limited  the  models 
to  full  quadratic  models  and  pure  cubic  terms. 

Hill et al. (2011) proceeded to generate response surface models for both the
transonic and supersonic design regions using approximately 900 experimental design
points of two alternative designs (a Nested Face-Centered Design (NFCD) and an
I-optimal computer-generated design) by sampling from the “ground truth” model with
an N(0, 0.0125) error added. They then compared the corresponding surfaces using a
Monte Carlo sampling methodology coupled with a statistical comparison to determine
the functional equivalency of the surfaces. Hill et al. (2011) demonstrated the
ability to generate equivalent response surfaces at a 90% reduction in experimental
effort.

Whereas Hill et al. (2011) focused on the ability to generate equivalent response
surfaces with a fraction of the original experimental runs, we are interested in
screening the nonlinear “ground truth” models for significant effects with a single
18-run six-factor augmented Definitive Screening Design (DSD+) experiment; see
Table 56.

Table 56. Six-Factor Augmented Definitive Screening Design (DSD+)

 X1   X2   X3   X4   X5   X6
  0    1    1    1    1    1
  0   -1   -1   -1   -1   -1
  1    0   -1    1    1   -1
 -1    0    1   -1   -1    1
  1   -1    0   -1    1    1
 -1    1    0    1   -1   -1
  1    1   -1    0   -1    1
 -1   -1    1    0    1   -1
  1    1    1   -1    0   -1
 -1   -1   -1    1    0    1
  1   -1    1    1   -1    0
 -1    1   -1   -1    1    0
  0    0    0    0    0    0
  1   -1    1    1    1    1
  1   -1   -1   -1   -1   -1
 -1    1    1    1    1    1
 -1    1    1   -1   -1   -1
 -1   -1   -1    1   -1   -1

Table 56 was generated by adding k - 1 = 5 runs to a k = 6 factor DSD via a
computerized search algorithm. Instead of the information matrix being based on a
main-effects-only model, the information matrix contains the main effects and the
five two-factor interactions involving a particular factor, $X_1$ in this instance.
The DSD+ design was constructed using a variant of the coordinate exchange
algorithm of Meyer and Nachtsheim (1995) to maximize the determinant of the updated
information matrix. Details on the computer search algorithms employed for the DSD
and DSD+ can be found in Jones and Nachtsheim (2011a) and Dougherty et al. (2013a),
respectively. To guard against local maxima, 10000 random starting designs, which
included the baseline six-factor DSD, were explored. As a result, twelve equivalent
designs were generated based upon both D-optimal and I-efficient criteria. For this
case study the first of the designs (Table 56) is used; however, in practice other
criteria could be used to further differentiate between the twelve designs.
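The effect of the augmentation can be checked numerically on the design as read
from Table 56. The sketch below does not reproduce the coordinate exchange search;
it only builds the model matrix described above (intercept, six main effects, and
the five interactions with $X_1$) for the first 13 runs and for all 18 runs and
compares their ranks. NumPy and the helper names are assumptions of this example.

```python
import numpy as np

# Design matrix as read from Table 56 (runs 1-13: DSD, runs 14-18: augmentation).
dsd_plus = np.array([
    [ 0,  1,  1,  1,  1,  1], [ 0, -1, -1, -1, -1, -1],
    [ 1,  0, -1,  1,  1, -1], [-1,  0,  1, -1, -1,  1],
    [ 1, -1,  0, -1,  1,  1], [-1,  1,  0,  1, -1, -1],
    [ 1,  1, -1,  0, -1,  1], [-1, -1,  1,  0,  1, -1],
    [ 1,  1,  1, -1,  0, -1], [-1, -1, -1,  1,  0,  1],
    [ 1, -1,  1,  1, -1,  0], [-1,  1, -1, -1,  1,  0],
    [ 0,  0,  0,  0,  0,  0],
    [ 1, -1,  1,  1,  1,  1], [ 1, -1, -1, -1, -1, -1],
    [-1,  1,  1,  1,  1,  1], [-1,  1,  1, -1, -1, -1],
    [-1, -1, -1,  1, -1, -1],
])
dsd = dsd_plus[:13]

def model_matrix(design):
    """Intercept, six main effects, and the five X1-by-Xj interactions."""
    x1 = design[:, [0]]
    return np.hstack([np.ones((len(design), 1)), design, x1 * design[:, 1:]])

def second_order_columns(design):
    """All pure quadratic and two-factor interaction columns."""
    k = design.shape[1]
    cols = [design[:, i] * design[:, j] for i in range(k) for j in range(i, k)]
    return np.column_stack(cols)

# DSD property: main effects are orthogonal to every second-order column.
me_vs_second_order = dsd.T @ second_order_columns(dsd)

rank_dsd = np.linalg.matrix_rank(model_matrix(dsd))
rank_plus = np.linalg.matrix_rank(model_matrix(dsd_plus))
print(np.all(me_vs_second_order == 0), rank_dsd, rank_plus)
```

For this design, the five $X_1$ interaction columns sum to zero over the 13 baseline
runs, so the 12-term information matrix is singular for the DSD alone; the five
added runs break that dependency and make all 12 terms estimable.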
5.5 Discussion


Simulated responses were generated by sampling from both “ground truth” models,
with an N(0, 0.0125) error included, at the design points specified by Table 56.
Five sets of data were simulated for each design region (transonic and supersonic).
Second-order empirical models were then constructed through forward stepwise
regression. With a p-value of 0.1 to enter, effects were added into the
second-order model while forcing a strong heredity model. As such, when either
two-factor interactions or pure quadratic effects are included in the model, the
lower-order terms must also be included.
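The strong heredity rule can be stated compactly in code. The sketch below is only
one way to express the rule and is not the stepwise procedure used in the study;
the tuple encoding of terms and the function name are hypothetical.

```python
def enforce_strong_heredity(model_terms):
    """Return model_terms plus every parent main effect they require.

    Terms are tuples of factor indices: (i,) is a main effect, (i, j) a
    two-factor interaction, and (i, i) a pure quadratic effect.
    """
    closed = set(model_terms)
    for term in model_terms:
        if len(term) == 2:
            closed.update({(term[0],), (term[1],)})
    return sorted(closed, key=lambda t: (len(t), t))

# Hypothetical stepwise state: X6, the X1*X3 interaction, and X5^2 have entered.
model = enforce_strong_heredity([(6,), (1, 3), (5, 5)])
print(model)
```

In this example the interaction pulls in the main effects of $X_1$ and $X_3$, and
the quadratic term pulls in the main effect of $X_5$, so the fitted model always
contains the lower-order parents of every second-order term.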

Since the “ground truth” models contain all six design variables at various signal
levels, screening design success is judged by the ability to determine active
effects at a given signal-to-noise ratio. For instance, with a noise level of
0.0125, factors $X_2$ and $X_6$ have signal-to-noise ratios of 0.42 and 49.34,
respectively, within the transonic region. As such, failing to identify $X_6$ would
be a more egregious error than failing to identify $X_2$. Table 57 displays the
average percentage (across five replications) of effects identified at four
different signal-to-noise ratios.

Table 57. Second-Order Screening Design Results for Case Study: 5-Replication
Average

                               Transonic (δ/ε)             Supersonic (δ/ε)
Scenario          Effect       >0    >1    >2    >3        >0    >1    >2    >3
ε ~ N(0, 0.0125)  ME           90%   100%  100%  100%      90%   100%  100%  100%
                  2FI          44%   57%   50%   50%       27%   36%   100%  100%
                  - with X1    36%   40%   0%    0%        36%   60%   100%  100%
                  PQ           32%   40%   40%   47%       48%   67%   100%  100%
                  Total        55%   62%   58%   57%       51%   65%   100%  100%

Note: Identified percentages correspond to the percentage of Main Effects (ME),
Two-Factor Interaction (2FI) effects, and Pure Quadratic (PQ) effects.


For a signal-to-noise ratio (δ/ε) > 0, the screening design is trying to identify
20 effects as being significant within each of Equations 5.1 and 5.2, excluding
only the cubic term. In contrast, for a signal-to-noise ratio (δ/ε) > 2, the
screening design is trying to identify 8 and 5 effects within Equations 5.1 and
5.2, respectively. Specifically, for Equation 5.2, the screening design is looking
to identify $X_1$, $X_6$, $X_1X_3$, $X_3^2$, and $X_5^2$.

Overall, with only 18 design runs, the DSD+ was able to identify over half of the
effects for both design regions. As the signal-to-noise ratio increases, so does
the percentage of effects identified, particularly in the supersonic region,
signifying that the design is identifying the larger, more significant terms of the
response surface. Most importantly, the design is nearly perfect, even at lower
signal-to-noise ratios, at identifying main effects (ME).

The DSD+ struggled to identify active second-order effects, regardless of the
signal-to-noise ratio, within the transonic design region and at the lowest
signal-to-noise ratio for the supersonic region. This performance is not surprising
given the large number of “active” effects at smaller signal-to-noise ratios and
the level of confounding between pure quadratic and two-factor interaction effects.
Screening designs stem from the Pareto principle, which states that most of the
variability in a system or process output is due to a small number of inputs. This
is not the case for the transonic region, where 20 of the 28 terms in a full
second-order empirical model using 6 factors are deemed active or significant.

Additionally, the DSD+ analysis methodology does not allow for the inclusion or
estimation of cubic terms. While both the transonic and supersonic “ground truth”
models contain only a single cubic term, the cubic term in the transonic model is
ten times larger and far more significant than the cubic term in the supersonic
model. As such, excluding cubic terms from the empirical model biases the remaining
model terms because the cubic term's effect will be attributed to either model
error or other effects, even potentially insignificant effects.

5.6 Conclusions


Under  budget  limitations,  testers  need  to  carefully  control  run  size.  Hill  et  al.  (2011) 
succeeded  in  generating  equivalent  response  surface  models  for  both  the  transonic  and 
supersonic  design  regions  at  a  90%  reduction  in  experimental  effort  when  compared 
to  traditional  wind  tunnel  testing  methodology.  But  neither  Hill  et  al.  (2011)  nor  the 
original  wind  tunnel  testers  were  restricted  in  the  number  of  available  experimental 
runs.  While  screening  designs  are  never  perfect,  they  offer  a  mechanism  to  determine 
those  factors  likely  to  be  most  active  in  defining  system  response  when  resources  are 
restricted.  At  a  99.8%  and  a  98%  reduction  in  experimental  effort  when  compared  to 
traditional  wind  tunnel  testing  and  Hill  et  al.  (2011),  respectively,  the  DSD+  was 
able  to  identify  the  majority  of  the  significant  second-order  effects  and  all  but  the 
smallest  main  effect,  particularly  for  the  supersonic  test  region. 

Systems  and  processes  continue  to  become  more  complex;  as  a  result,  the  number 
of  factors  considered  for  their  effect  on  the  system  or  process  response  grows.  In  a 
time  when  resources  are  shrinking,  the  increase  in  factors  of  interest  requires 
additional  experimental  runs  if  more  efficient  design  methodologies  are  not  employed. 
This  insight  leads  to  customization  of  the  full  design  in  order  to  yield  better  run 
efficiencies;  however,  there  is  always  a  chance  of  misleading  results,  as  is  the  case 
with  almost  any  statistical  analysis.  Thus,  there  will  always  be  a  need  for 
system-specific  expertise  as  a  complement  to  the  analysis  of  the  experimental  data. 
The  analysis  of  screening  designs  may  not  be  an  easy  task.  Statistical  proficiency  and 
capable  analytical  packages  will  be  required. 
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VI.  Summary  and  Recommendations 


6.1  Summary  of  Work 

Screening  designs  are  a  category  of  experimental  designs,  usually  performed  during 
the  early  stages  of  a  process  or  system  study,  used  to  determine  which  of  the  many 
factors  (if  any)  have  a  significant  effect  on  the  system  or  process.  Selecting  which 
screening  design  to  use  from  the  multitude  of  available  designs  is  not  always 
straightforward.  Assumptions  related  to  the  principles  of  sparsity  and  heredity  and 
the  form  of  the  empirical  model  used  to  represent  the  process  or  system  response 
should  help  determine  selection  of  an  appropriate  screening  design.  Factors  such  as 
whether  follow-up  experiments  are  available  and  the  ease  with  which  statistical 
analysis  of  the  data  can  be  done  can  also  affect  design  selection.  As  a  result,  with  the 
help  of  subject  matter  experts,  research  and  development  of  specialized 
computer-generated  designs  which  exhibit  desirable  design  parameters  has  increased. 

Second-order  screening  designs  are  a  relatively  new  focus  in  statistical  research  and 
largely  unknown  to  the  defense  test  community.  Second-order  screening  designs  are 
single  experimental  designs  capable  of  performing  both  factor  screening  and  response 
surface  estimation  when  conducting  multiple  experiments  is  unrealistic  due  to  time, 
budget,  or  other  constraints.  This  dissertation  explored  the  robustness  of  leading 
designs,  developed  an  augmentation  strategy  to  improve  one  of  the  leading  designs, 
and  provided  a  case  study  application  of  such  designs.  Chapter  II  contains  a  detailed 
literature  review  of  screening  and  response  surface  designs,  partitioned  by  sequential 
and  single-phase  methods  for  fitting  first-order  and  second-order  response  surfaces. 
Chapters  III,  IV,  and  V  are  self-contained  research  articles  on  second-order  screening 
designs.  Each  contains  a  literature  review  of  the  research  relevant  to  that  chapter. 


Two  important  principles  used  in  developing  successful  screening  designs  are 
sparsity  and  heredity.  However,  uncertainty  about  the  degree  to  which  factor  sparsity 
holds  as  the  number  of  factors  grows  has  resulted  in  a  debate  between  effect  sparsity 
and  factor  sparsity.  Heredity,  either  strong  or  weak,  is  the  second  screening  principle 
commonly  used  when  considering  model  selection.  Strong  heredity  implies  that  if  a 
model  includes  a  two-factor  interaction,  then  both  of  its  constituent  main  effects  are 
included  in  the  model.  In  contrast,  weak  heredity  requires  that  only  one  of  the  two 
constituent  main  effects  be  included  in  the  model.  To  date,  evaluation  of  screening 
design  performance  has  assumed  both  factor  sparsity  and  strong  effect  heredity. 
Chapter  III  formally  examines  the  robustness  of  the  two  arguably  best  second-order 
screening  designs  with  respect  to  the  assumptions  of  both  sparsity  (factor  or  effect) 
and  heredity  (strong  or  weak). 
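
The  two  heredity  definitions  can  be  stated  as  a  simple  check  on  a  candidate  model. 
The  sketch  below  is  illustrative  only:  the  function  name  and  the  representation  of 
interactions  as  pairs  of  factor  labels  are  assumptions,  not  part  of  any  analysis 
methodology  described  in  this  work. 

```python
def satisfies_heredity(main_effects, interactions, kind="strong"):
    """Check whether a model obeys strong or weak effect heredity.

    main_effects: iterable of factor labels included as main effects.
    interactions: iterable of (factor, factor) pairs included as
        two-factor interactions.
    kind: "strong" requires both parent main effects for every
        interaction; "weak" requires at least one.
    """
    if kind not in ("strong", "weak"):
        raise ValueError("kind must be 'strong' or 'weak'")
    required = 2 if kind == "strong" else 1
    active = set(main_effects)
    for a, b in interactions:
        parents_present = (a in active) + (b in active)
        if parents_present < required:
            return False
    return True


# Model with main effect A only and the interaction AB.
print(satisfies_heredity({"A"}, [("A", "B")], kind="strong"))  # False
print(satisfies_heredity({"A"}, [("A", "B")], kind="weak"))    # True
```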

Whenever  a  screening  design  is  employed,  analytical  tradeoffs  must  be  accepted. 
Definitive  Screening  Designs  are  run-size  efficient  when  strong  heredity  and  factor 
sparsity  are  present  and  when  few  second-order  effects  are  active.  Chapter  IV 
describes  a  computer-generated  D-optimality  design  augmentation  technique  which 
uses  a  k-factor  Definitive  Screening  Design  (DSD)  as  a  baseline  fixed  design  and 
augments  the  design  with  k  -  1  additional  runs.  In  a  simulation  study,  the  proposed 
augmented  Definitive  Screening  Design  (DSD+)  was  able  to  increase  the  robustness 
of  the  original  DSD  to  the  assumptions  of  heredity  and  sparsity  while  also  increasing 
the  detection  rate  of  second-order  effects  when  both  two-factor  interactions  and 
pure-quadratic  effects  are  active. 
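
As  a  rough  check  on  run  size,  the  sketch  below  assumes  the  baseline  is  a  minimum-run 
DSD  with  2k  +  1  runs  (k  foldover  pairs  plus  one  center  run);  that  run  count  is  an 
assumption  here  and  is  not  stated  in  this  chapter.  The  function  names  are 
illustrative. 

```python
def dsd_runs(k: int) -> int:
    """Runs in a minimum-run DSD: k foldover pairs plus a center run (assumed)."""
    return 2 * k + 1


def dsd_plus_runs(k: int) -> int:
    """Runs in the augmented design: baseline DSD plus k - 1 added runs."""
    return dsd_runs(k) + (k - 1)


for k in (4, 6, 8):
    print(k, dsd_runs(k), dsd_plus_runs(k))
```

Under  this  assumption  the  augmented  design  has  3k  runs  in  total,  which  for  six  factors 
gives  13  +  5  =  18  runs,  matching  the  18-run  DSD+  used  in  the  wind  tunnel  case  study. 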

Chapter  V  demonstrates  the  viable  use  of  second-order  screening  designs, 
particularly  as  applied  to  defense  testing,  through  a  wind  tunnel  case  study. 


6.2  Recommendations  for  Future  Research 


This  work  focused  on  the  robustness  and  augmentation  of  existing  second-order 
screening  designs.  Evaluation  of  design  performance  was  based  upon  the  original 
design  authors’  recommended  analysis  methodology.  It  would  be  interesting  to  study 
alternative  analysis  methodologies  to  see  whether  the  designs'  ability  to  identify 
active  effects  can  be  attributed  to  design  structure  or  to  analysis  methodology. 

Supersaturated  designs  can  be  used  in  large  screening  experiments  when  the  number 
of  factors  exceeds  the  number  of  available  runs.  While  research  on  improving  the 
construction  and  analysis  of  supersaturated  designs  continues,  very  little  work  is 
being  done  on  constructing  supersaturated  designs  which  are  capable  of  response 
surface  exploration.  It  would  be  interesting  to  study  the  construction  of  designs 
which  are  supersaturated  in  terms  of  the  total  number  of  effects  rather  than  the 
number  of  factors. 
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